Summary


A method can be thought of as a distillation of good practice for a particular
system development situation. A successful engineering, management, production,
or support technique is formalized into a method in the hope of raising the
performance of the novice practitioner to a level comparable with that of an
expert through the appropriate use of the method. Individual methods are
normally accompanied by a special-purpose graphical language that provides
focus and display emphasis for the major concepts that need discovery,
consensus, or decision relative to a specific system development life cycle
activity. Experience has shown that personal and organizational preferences for
particular methods are likely to continue, making it necessary to isolate the
information gathered and displayed by one method so that it can be used in
other stages of the life cycle or displayed in alternative forms.

This paper outlines the theoretical foundations necessary to construct a
Neutral Information Representation Scheme (NIRS) which will allow for automated
data transfer and translation between model languages, procedural programming
languages, database languages, transaction and process languages, as well as
knowledge representation and reasoning control languages for information system
specification.

Introduction


This document presents the theoretical foundations for information
representation languages of both graphical and textual varieties. It is
intended to serve as a framework for providing rigorous syntax and semantics of
existing and proposed information analysis, design, and engineering methods.
The purpose of such a framework is to provide information representation
language designers with the guidance necessary to allow for automated
inter-model data transfer and translation. Thus, this document should be viewed
as the structure for an information model data exchange specification. Finally,
this theory is motivated by the need for a general theory of information
representation. It thus serves as the first step towards achievement of a
Neutral Information Representation Scheme (NIRS) for an Integrated Development
Support Environment (IDSE) that can serve as the platform for a seamless
Computer Aided Software Engineering (CASE) environment. Section 1 of this
document describes the motivations and considerations behind the proposed
theory. Section 2 introduces a restricted first-order language syntax that is
proposed as the bounding syntactic structure for information modeling
languages. Section 3 provides a model theoretic semantics for those languages,
and Section 4 a corresponding logic. Section 5 describes the application of
these concepts to constraint languages.

1  Motivation 

The Air Force Integrated Information Systems Evolution Environment (IISEE)
project represents a comprehensive research effort to develop technologies
critical to effectively manage, control, and exploit information as a resource.
The resulting developments will provide integration support methodologies,
frameworks, and experimental tools to support integrated information management
systems development and evolution.

One of the key premises on which this program is based is the recognition of
the need for a suite of information modeling methods to service the large
number of tasks and user/developer roles in an evolutionary integrated
information system development process. Each method in this suite is designed
to serve a particular class of human users performing specific tasks or
decision processes. The individual methods normally are accompanied by a
special purpose graphical language that serves to provide focus and display
emphasis for the major concepts that need discovery, consensus, or decision
relative to that task. The problem with this approach is that these syntactic
features restrict the information that can be stated in the language.

The seamless CASE concept is focused on development of the technological
components and management methods for seamless software engineering
environments. The term “seamless” is meant to convey the integrated nature of
the methods and tools provided to the software implementer. The pluralization
of the term “environments” is meant to convey the fact that different seamless
CASE environments will be defined for different software types.

This particular document is the result of research which began as an effort to
define a constraint specification language for a particular information
modeling method known as IDEF1.¹ An overview of the method and its
formalization are found in Appendices A and B. As the effort progressed, it was
recognized that the emerging language structures were similar to those being
investigated for two other purposes: the conceptual schema representation
language for the IDSE seamless CASE environment, and the Neutral Information
Representation Scheme, which is to provide the basis for an evolving system
description capable of supporting automated knowledge-based model translation.
The theory presented in this report has been used as the formal foundation for
a family of languages that will serve the above described purposes. This family
of Information System Constraint Languages (ISyCL) is described in [1].

2  First-order  Languages 

The basis of our account will be the notion of a first-order language.
First-order languages are flexible, expressively quite rich, and extremely well
understood. They are used extensively in mathematics, linguistics, philosophy,
and computer science whenever clarity of expression is especially important.
Many familiar mathematical theories such as the theory of sets, boolean
algebra, topology, etc., can be elegantly expressed in first-order terms. More
recently first-order languages have found their way into the domain of
artificial intelligence, where they find straightforward representation in
familiar AI programming languages like LISP and PROLOG. (Indeed, first-order
mathematical logic is the formal foundation of PROLOG, an acronym for
PROgramming in LOGic.²)

¹ See, e.g., [2] and [3]. “IDEF” was originally an acronym for “ICAM Definition
Language,” but the suite of IDEF methods has since evolved independently of its
ICAM origins. Hence, like “NCR” (formerly an acronym for “National Cash
Register”), “IDEF” is now simply a name like “George,” and an acronym no
longer.

Generally speaking, a first-order language $\mathcal{L}$ is a formal language.
That is, it is a formal structure consisting of a fixed set of basic symbols,
often called the vocabulary of $\mathcal{L}$, and a precise set of syntactic
rules, its grammar, for building up the proper sentences, or formulas, of the
language that are capable of bearing information.

2.1  Vocabulary 

The  basic  vocabulary  of  a  first-order  language  consists  of  several  kinds  of 
symbols: 

•  Constants 

•  Variables 

•  Function  symbols 

•  Predicates 

•  Logical  symbols. 

Constants are symbols that correspond to names in ordinary language. For many
purposes, it is useful to use abbreviations of names straight out of ordinary
language for constants, e.g., $j$ for John, $wp$ for Wright-Patterson, $v$ for
Venus, $o$ for Ohio, etc. When we are describing languages in general and have
no specific application in mind, we will simply use the letters $a$, $b$, $c$,
and $d$, perhaps with subscripts; we will assume that we will add no more than
finitely many subscripted constants to our language.³ Constants are usually
lower case letters, with or without subscripts, but this is not necessary.
Indeed, it is often useful to use upper case.

² See, e.g., [4].

We will often want to say things about an “arbitrary” constant as a way of
talking about all constants, much as one might talk about an arbitrary triangle
ABC in geometry as a way of proving something about all triangles in general.
For this purpose it will not do to talk specifically about a given constant,
$a$ say, since we want what we say to apply to all constants generally. This
requires that, when we are talking about our language, we use special
metavariables whose role is to serve as placeholders for arbitrary constants of
our language, much as “ABC” above serves as a placeholder for arbitrary
triangles. Thus, metavariables are not themselves part of our first-order
language $\mathcal{L}$, but rather part of the extended English we are using to
talk about the constants that are in the language. We will use the lower case
sans serif characters $\mathsf{a}$, $\mathsf{b}$, $\mathsf{c}$ for this purpose.

Next are the variables, whose purpose will be clarified in detail below. The
lower case letters $x$, $y$, and $z$, possibly with subscripts, will play this
role, and we will suppose there to be an unlimited store of them. We will use
the corresponding sans serif characters $\mathsf{x}$, $\mathsf{y}$, and
$\mathsf{z}$ as metavariables over the store of variables in our language.

Third, we have function symbols. These symbols correspond most closely in
natural language to expressions of the form “The X of,” where X is a common
noun phrase like “color,” “yearly salary,” “mother,” etc., or expressions of
the form “The Y-est X in,” where Y is an adjective like “smart” or “mean,” and
X is once again a common noun phrase. Common noun phrases typically express
general properties. For any common noun phrase CNP, the result of replacing X
with CNP in either of the above forms (together with an adjective for Y in the
second form) intuitively names a function $f$ that, when applied to a given
object $a$, yields the appropriate instance $f(a)$ of the property expressed by
the CNP for that object. Thus, where X is “color,” the resulting function in
the first form yields the color of the object to which it is applied; where it
is “yearly salary,” the resulting function yields an appropriate dollar amount.
Similarly, “The smartest woman in” expresses a function that takes places
(e.g., cities, universities, etc.) and yields for each such place the smartest
woman therein.

³ The restriction to a finite number of constants here is not at all essential,
but constraint languages in general will use only finitely many; the same holds
for predicates and function names below.

For the most part we will confine our attention to “one-place” functions such
as those above that take a single object to another object. But as we will see
there are occasions when we will want to represent functions of more than one
argument as well. Examples of expressions that stand for two-place functions
are “The only child of ... and ...” and “The sum of ... and ....” Intuitively,
the former expresses a partial function⁴ from couples with a single child to
that child, and the latter simply expresses the addition function, which takes
two given numbers to a further number, viz., their sum.

As with constants, in practice it is often convenient to abbreviate relevant
ordinary language functional expressions in defining the function symbols of a
formal language. Again, we will use the letters $f$, $g$, and $h$, possibly
with subscripts, for our basic function symbols, and corresponding sans serif
characters as metavariables. Function symbols designed to stand for functions
of more than one argument will be indicated with an appropriate numerical
superscript. As above, we will suppose there are only finitely many of these
symbols in our language.

We also introduce the symbol $\bullet$, and stipulate that where $\mathsf{g}$
stands for any $n$-place function symbol in our language, and $\mathsf{f}$
stands for any one-place function symbol, then $\mathsf{f} \bullet \mathsf{g}$
is an $n$-place function symbol as well. This corresponds in ordinary language
to the fact that we can nest functional expressions, e.g., “The salary of the
father of the smartest woman in the largest university in ...,” or “The
successor of the sum of ... and ....”

The fourth group of symbols in our language consists of $n$-place predicates,
$n \geq 1$. One-place predicates correspond roughly to verb phrases like “is a
computer scientist,” “has insomnia,” “is an employee,” and so forth, all of
which express properties. Two-place predicates correspond roughly to transitive
verbs like “loves,” “is an element of,” “is less than,” “begat,” and “lives
with,” which express two-place relations between things. There are also
three-place relations, such as those expressed by “gives” and “between,” and
with a little work we could come up with relations of more than three places,
but in practice we shall have little cause to go much beyond this.

⁴ I.e., a function that might not be defined on every element of its domain.
E.g., the square root function is only a partial function on the natural
numbers, since it is not defined on those numbers which are not squares of
other numbers. The function in the text here is partial because its intuitive
domain is the set of pairs of humans, and not every such pair has a single
child.

We will use upper case roman letters such as $P$, $Q$, and $R$ for predicates,
and again corresponding sans serif characters as metavariables over predicates.
Occasionally predicates will appear with numerical superscripts to indicate the
number of places of the relation they represent, and if necessary with
subscripts to distinguish those with the same superscripts. It is often useful
to abbreviate relevant natural language expressions. Most languages contain a
distinguished predicate for the two-place relation “is identical to.” We will
use the symbol $\approx$ for this purpose.

To drive home the difference at this point between predicates and function
symbols, note that a function symbol combines with names to yield yet another
name-like (i.e., referring) expression: e.g., to draw on ordinary language, the
function symbol “the husband of” combines with the name “Di” to yield the new
referring expression (or definite description, as such are often called) “the
husband of Di.” On the other hand, a (one-place) predicate combines with a name
to form a sentence, something that can be true or false, not a name-like
expression. Thus, the predicate expression “is happy” combines with the name
“Di” to yield the sentence “Di is happy.” The same is easily seen to hold for
$n$-place predicates generally.

The last group of symbols consists of the basic logical symbols: $\neg$,
$\wedge$, $\vee$, $\supset$, $\equiv$, the existential quantifier $\exists$,
and the universal quantifier $\forall$, about which we shall have more to say
shortly. We will also need parentheses and perhaps other grouping indicators to
prevent ambiguity.
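
As a rough illustration of the five kinds of symbols just listed, the
vocabulary of a small language can be recorded as plain data. The sketch below
is in Python; the class name `Vocabulary`, its field names, and the sample
function symbol and predicate names are assumptions made for this example and
are not notation from the text.

```python
from dataclasses import dataclass

# The logical symbols are shared by every first-order language.
LOGICAL_SYMBOLS = ("¬", "∧", "∨", "⊃", "≡", "∃", "∀")


@dataclass
class Vocabulary:
    """Symbols of one first-order language (illustrative layout)."""
    constants: set    # e.g. {"j", "wp"}
    variables: set    # e.g. {"x", "y", "z"}
    functions: dict   # function symbol -> number of places
    predicates: dict  # predicate -> number of places

    def kind(self, symbol):
        """Classify a symbol; return None if it is not in the vocabulary."""
        if symbol in LOGICAL_SYMBOLS:
            return "logical symbol"
        if symbol in self.constants:
            return "constant"
        if symbol in self.variables:
            return "variable"
        if symbol in self.functions:
            return "function symbol"
        if symbol in self.predicates:
            return "predicate"
        return None


# The constants follow the abbreviations used in the text; the function
# symbol and predicate names are hypothetical.
vocab = Vocabulary(
    constants={"j", "wp", "v", "o"},
    variables={"x", "y", "z"},
    functions={"salary_of": 1, "sum_of": 2},
    predicates={"HAPPY": 1, "LOVES": 2, "≈": 2},
)
```

Recording the number of places alongside each function symbol and predicate is
one simple way to keep the information that the text conveys with numerical
superscripts.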


2.2  Grammar 

Now that we have our basic symbols, we need to know how to combine them into
grammatical units, or well-formed formulas, the formal correlates of sentences.
These will be the expressions that can encode the sort of information we will
want to express in our theory (and more). This is done recursively as
follows.⁵

First, we want to group all name-like objects into a single category known as
terms. This group will of course include the constants, and for reasons below,
it will include the variables as well. But recall the discussion of function
symbols above. There we saw that an expression like “The yearly salary of”
seems to name a function on objects. But the values of functions are objects as
well. Thus, when we attach a name, “Fred,” say, to the functional expression
above, the result (“The yearly salary of Fred”) is a sort of name for Fred's
yearly salary. Thus, we count the result of attaching a function symbol to an
appropriate number of constants and/or variables as a term as well; and such
terms can also be among the terms that a function symbol attaches to. Thus,
more exactly, letting $\mathsf{t}_1, \mathsf{t}_2, \ldots$ stand for arbitrary
terms and $\mathsf{f}$ stand for an arbitrary function symbol, if
$\mathsf{t}_1, \ldots, \mathsf{t}_n$ are terms and $\mathsf{f}$ is an $n$-place
function symbol, then $\mathsf{f}(\mathsf{t}_1, \ldots, \mathsf{t}_n)$ is a
term as well.
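
The recursive definition of terms can be sketched as a small data type with a
well-formedness check. This is only an illustration in Python: the class names
`Const`, `Var`, and `App`, the table `PLACES`, and the function symbols in it
are hypothetical and do not come from the text.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    """A function symbol applied to a tuple of argument terms."""
    fn: str
    args: Tuple["Term", ...]


Term = Union[Const, Var, App]

# Hypothetical function symbols with their number of places.
PLACES = {"salary_of": 1, "father_of": 1, "sum_of": 2}


def is_term(t) -> bool:
    """Constants and variables are terms; f(t1, ..., tn) is a term when
    f is an n-place function symbol and each ti is itself a term."""
    if isinstance(t, (Const, Var)):
        return True
    if isinstance(t, App):
        right_places = PLACES.get(t.fn) == len(t.args)
        return right_places and all(is_term(a) for a in t.args)
    return False


fred_salary = App("salary_of", (Const("fred"),))
nested = App("salary_of", (App("father_of", (Var("x"),)),))
wrong_places = App("sum_of", (Const("a"),))  # a two-place symbol, one argument
```

Here `is_term(fred_salary)` and `is_term(nested)` hold, while
`is_term(wrong_places)` fails because the number of arguments does not match
the number of places of the function symbol.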

Terms formed out of certain familiar two-place function symbols, examples of
which will be introduced below, are more commonly written in infix notation,
rather than the prefix notation just defined, with the function symbol flanked
by the two terms, rather than preceding them. Thus, for a two-place function
symbol $\mathsf{f}$ and terms $\mathsf{t}, \mathsf{t}'$, the term
$\mathsf{f}(\mathsf{t}, \mathsf{t}')$ can also be written as
$\mathsf{t}\,\mathsf{f}\,\mathsf{t}'$. So, for example, $+(2, 3)$ can be
written as $2 + 3$.

Next we define the basic formulas of our language. Just as verb phrases and
transitive verbs in ordinary language combine with names to form sentences, so
in our formal language predicates combine with terms to form formulas.
Specifically, if $\mathsf{P}$ is any $n$-place predicate, and
$\mathsf{t}_1, \ldots, \mathsf{t}_n$ are any $n$ terms, then
$\mathsf{P}\mathsf{t}_1 \ldots \mathsf{t}_n$ is a formula, and in particular an
atomic formula. To illustrate this, if $H$ abbreviates the verb phrase “is
happy,” and $a$ the name “Annie,” then the formula $Ha$ expresses the
proposition that Annie is happy. Again, if $L$ abbreviates the verb “loves,”
$b$ the name “Bob,” $c$ the name “Charlie,” and $f$ the expression “the fiance
of,” then the formula $Lbf(c)$ expresses the proposition that Bob loves
Charlie’s fiance.

⁵ That is, the definition is given in such a way that complex cases of the
class being defined are defined in terms of simpler cases of the same class.
Recursive definitions thus often look circular, but they are not, as they
always begin with well-grounded initial cases not defined in terms of other
members of the class being defined.

Often when one is using more elaborate predicates drawn from natural language,
e.g., if we had used LOVES instead of $L$ in the previous example, it is more
readable to use parentheses around the terms in atomic formulas that use the
predicate and separate them by commas, e.g., LOVES$(b, x)$ instead of
LOVES$bx$. Thus, more generally, any atomic formula
$\mathsf{P}\mathsf{t}_1 \ldots \mathsf{t}_n$ can be written also as
$\mathsf{P}(\mathsf{t}_1, \ldots, \mathsf{t}_n)$. Furthermore, atomic formulas
involving some familiar two-place predicates like $\approx$, and a few others
that will be introduced below, are more often written using infix rather than
prefix notation. For example, we usually express that $a$ is identical to $b$
by writing $a \approx b$ rather than $\approx ab$. Thus, we stipulate that
formulas of the form $\mathsf{P}\mathsf{t}\mathsf{t}'$ can also be written as
$\mathsf{t}\mathsf{P}\mathsf{t}'$.
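
The three ways of writing an atomic formula described above (plain prefix,
parenthesized prefix, and infix) can be compared side by side with a small
printer. The sketch is in Python, and the function names `show_term` and
`show_atomic`, the `style` parameter, and the encoding of functional terms as
tuples are all assumptions made for this illustration.

```python
def show_term(t):
    """Render a term: a string is a constant or variable; a tuple
    (f, t1, ..., tn) is a function symbol applied to n terms."""
    if isinstance(t, str):
        return t
    fn, *args = t
    return f"{fn}({', '.join(show_term(a) for a in args)})"


def show_atomic(pred, terms, style="prefix"):
    """Render the atomic formula built from `pred` and `terms`."""
    parts = [show_term(t) for t in terms]
    if style == "prefix":          # Pt1...tn
        return pred + "".join(parts)
    if style == "paren":           # P(t1, ..., tn)
        return f"{pred}({', '.join(parts)})"
    if style == "infix":           # t P t', two-place predicates only
        if len(parts) != 2:
            raise ValueError("infix notation needs exactly two terms")
        return f"{parts[0]} {pred} {parts[1]}"
    raise ValueError(f"unknown style: {style}")


print(show_atomic("L", ["b", ("f", "c")]))            # Lbf(c)
print(show_atomic("LOVES", ["b", "x"], style="paren"))  # LOVES(b, x)
print(show_atomic("≈", ["a", "b"], style="infix"))      # a ≈ b
```

All three renderings denote the same kind of object, a predicate together with
its sequence of terms; only the surface notation differs.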

Now we begin introducing the logical symbols that allow us to build up more
complex formulas. Intuitively, the symbol $\neg$ expresses negation; i.e., it
stands for the phrase “it is not the case that.” Since we can negate any
declarative sentence by attaching this phrase to the front of it, we have the
corresponding rule in our formal grammar that if $\varphi$ is any formula, then
so is $\neg\varphi$. The symbols $\wedge$, $\vee$, $\supset$, and $\equiv$
stand roughly for “and,” “or,” “if...then,” and “if and only if,” which are
also (among other things) operators that form new sentences out of old in the
obvious ways. Unlike negation, though, each takes two sentences and forms a new
sentence from them. Thus, we have the corresponding rule that if $\varphi$ and
$\psi$ are any two formulas of our language, then so are
$(\varphi \wedge \psi)$, $(\varphi \vee \psi)$, $(\varphi \supset \psi)$, and
$(\varphi \equiv \psi)$.
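
The formation rules for negation and the four binary connectives can be
sketched as a recursive check over a simple encoding of formulas. In this
Python illustration formulas are nested tuples; the tags `"atom"`, `"not"`,
`"and"`, `"or"`, `"implies"`, and `"iff"`, and the helper names, are
assumptions of the example, and the terms of atomic formulas are treated as
plain strings for brevity.

```python
# Binary connectives and the symbols used to print them.
BINARY = {"and": "∧", "or": "∨", "implies": "⊃", "iff": "≡"}


def is_formula(phi) -> bool:
    """Check the formation rules for atomic formulas, negations, and
    formulas built with a binary connective."""
    if not isinstance(phi, tuple) or not phi:
        return False
    tag = phi[0]
    if tag == "atom":      # ("atom", predicate, (t1, ..., tn))
        return len(phi) == 3
    if tag == "not":       # ("not", phi1)
        return len(phi) == 2 and is_formula(phi[1])
    if tag in BINARY:      # (connective, phi1, phi2)
        return len(phi) == 3 and is_formula(phi[1]) and is_formula(phi[2])
    return False


def show(phi) -> str:
    """Print a formula with the parenthesization used in the text."""
    tag = phi[0]
    if tag == "atom":
        return phi[1] + "".join(phi[2])
    if tag == "not":
        return "¬" + show(phi[1])
    return f"({show(phi[1])} {BINARY[tag]} {show(phi[2])})"


Ha = ("atom", "H", ("a",))
Lbc = ("atom", "L", ("b", "c"))
example = ("implies", ("and", Ha, ("not", Lbc)), Ha)
print(show(example))  # ((Ha ∧ ¬Lbc) ⊃ Ha)
```

A negation takes exactly one subformula and each binary connective exactly two,
so a tuple such as `("and", Ha)` is rejected by `is_formula`.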

Finally, we turn to the quantifiers $\exists$ and $\forall$. Recall that we
introduced variables without explanation above. Intuitively, $\exists$ and
$\forall$ stand for “some” and “every,” respectively; the job of the variables
is to enable them to play this role in our formal language. Consider the
difference between “Annie is happy,” “Some individual is happy,” and “Every
individual is happy.” In the first case, a specific individual is picked out by
the name “Annie” and the property of being happy is predicated of her. In the
second, all that is stated is that some unspecified individual or other has
this property. And in the third, it is stated that every individual, whether
specifiable or not, has this property. This lack of specificity in the latter
two cases can be made explicit by rephrasing them like this: for some (resp.,
every) individual $x$, $x$ is happy. Since the rule for building atomic
formulas counted variables among the terms, we have the means for representing
these paraphrases. Let $H$ abbreviate “is happy” once again; then we can
represent the paraphrases as $\exists x Hx$ and $\forall x Hx$ respectively.

Accordingly, we add the final rule to our grammar: if $\varphi$ is any formula
of our language and $\mathsf{x}$ is any variable, then
$\exists\mathsf{x}\varphi$ and $\forall\mathsf{x}\varphi$ are formulas as well.
In such a case we say that the variable $\mathsf{x}$ is bound by the quantifier
$\exists$ (resp., $\forall$), and we say that the formula $\varphi$ is the
scope of the quantifier $\exists$ in $\exists\mathsf{x}\varphi$, and it is the
scope of the quantifier $\forall$ in $\forall\mathsf{x}\varphi$.
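
One way to see what binding does is to compute which variables of a formula are
left unbound. The passage itself only defines “bound” and “scope”; the
complementary notion of a “free” variable used below is standard terminology
but is an addition of this sketch, as are the tuple encoding, the tag names,
and the fixed set `VARIABLES`.

```python
VARIABLES = {"x", "y", "z"}
BINARY = {"and", "or", "implies", "iff"}


def term_vars(t):
    """Variables occurring in a term; a tuple (f, t1, ..., tn) is a
    function symbol applied to n argument terms."""
    if isinstance(t, tuple):
        return set().union(*(term_vars(a) for a in t[1:]))
    return {t} if t in VARIABLES else set()


def free_vars(phi):
    """Variables of `phi` not bound by any quantifier in `phi`."""
    tag = phi[0]
    if tag == "atom":                 # ("atom", predicate, (t1, ..., tn))
        return set().union(*(term_vars(t) for t in phi[2]))
    if tag == "not":
        return free_vars(phi[1])
    if tag in BINARY:
        return free_vars(phi[1]) | free_vars(phi[2])
    if tag in ("exists", "forall"):   # (tag, variable, scope)
        return free_vars(phi[2]) - {phi[1]}
    raise ValueError(f"not a formula: {phi!r}")


Hx = ("atom", "H", ("x",))
some_happy = ("exists", "x", Hx)               # ∃xHx
loves_y = ("forall", "x", ("atom", "L", ("x", "y")))
```

In `some_happy` the quantifier binds `x` throughout its scope `Hx`, so no
variable remains free, whereas in `loves_y` the variable `y` lies inside the
scope of the quantifier but is not bound by it.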

3  First-order  Semantics 

3.1  Structures  and  Interpretations 

We  have  motivated  the  construction  of  our  grammar  by  referring  to  the 
intended  meanings  of  the  logical  symbols  and  by  letting  our  constants  and 
variables  abbreviate  meaningful  expressions  out  of  ordinary  language.  But 
from  a  purely  formal  point  of  view,  all  we  have  in  a  language  is  uninterpreted 
syntax;  we  have  not  described  in  any  formal  way  how  to  assign  meaning  to 
the  elements  of  a  first-order  language.  We  will  do  so  now. 

A structure for a first-order language $\mathcal{L}$ consists simply of two
elements: a set $\mathcal{D}$ called the domain of the structure, and a
function $\mathcal{F}$ known as an interpretation function for $\mathcal{L}$.
Intuitively, $\mathcal{D}$ is the set of things one is describing with the
resources of $\mathcal{L}$, e.g., the natural numbers, major league baseball
teams, the people and objects that make up an air force base, or the records
inside a database. The purpose of $\mathcal{F}$ is to fix the meanings of the
basic elements of $\mathcal{L}$ in terms of objects in or constructed from
$\mathcal{D}$.

3.1.1  Interpretations  of  Constants  and  Function  Symbols 

The interpretation function works like this. First we deal with terms. We begin
by noting that variables will not receive an interpretation, since their
meanings can vary (they are variables after all) within a structure. They will
be treated with their own special semantic apparatus below. Constants on the
other hand, being the formal analogues of names with fixed meanings, are
assigned members of $\mathcal{D}$ once and for all as their interpretation; in
symbols, for all constants $\kappa$ of $\mathcal{L}$,
$\mathcal{F}(\kappa) \in \mathcal{D}$.

To deal with terms formed from function symbols, we need first to interpret the
function symbols themselves. To begin with, each basic function symbol $\alpha$
is assigned a function $\mathcal{F}(\alpha)$ from $\mathcal{D}$ into
$\mathcal{D}$. As indicated above, the functions expressed in ordinary language
are often partial; that is, they are often not defined everywhere. For example,
the function expressed by “The salary of” is not defined when applied to a
conveyer belt or a garden vegetable. This suggests that we ought to let the
functions from $\mathcal{D}$ into $\mathcal{D}$ that interpret our function
symbols be partial. This leads to certain inelegancies in our formal apparatus,
however, so we opt instead to include a distinguished object $\bot$ in our
domain $\mathcal{D}$ whose sole purpose is to be the value of functions applied
to objects on which they are intuitively undefined. Thus, if we have a function
symbol $f$ abbreviating “The salary of,” and if our domain $\mathcal{D}$
contains both persons and conveyer belts, then the interpretation of $f$ will
be the function that takes each person to his or her salary in dollars, and
every other kind of object to our distinguished object $\bot$. Formally, then,
for all basic $n$-place function symbols $\alpha$ of $\mathcal{L}$,

$$\mathcal{F}(\alpha) \in \{\Phi \mid \Phi : \mathcal{D}^n \to \mathcal{D}\};$$

that is, the interpretation of a basic $n$-place function symbol $\alpha$ of
$\mathcal{L}$ is going to be an element of the set of all $n$-place functions
from the set of $n$-tuples of the domain $\mathcal{D}$ into $\mathcal{D}$.
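
The device of a distinguished “undefined” object can be sketched as follows.
The Python names `BOTTOM`, `DOMAIN`, `SALARY`, and `totalize`, and the
particular individuals and dollar amounts, are invented for this illustration;
the only idea taken from the text is that an intuitively partial function is
replaced by a total one that sends every other object to the distinguished
object.

```python
BOTTOM = "⊥"  # stand-in for the distinguished object of the domain

# A hypothetical domain: two persons, a conveyer belt, two dollar
# amounts, and the distinguished object itself.
DOMAIN = {"Ann", "Bob", "belt-7", 52000, 48000, BOTTOM}

# The intuitive, partial "salary of" function, defined on persons only.
SALARY = {"Ann": 52000, "Bob": 48000}


def totalize(partial, domain):
    """Return a total one-place function on `domain` that agrees with
    `partial` where it is defined and yields BOTTOM everywhere else."""
    def total(x):
        if x not in domain:
            raise ValueError(f"{x!r} is not in the domain")
        return partial.get(x, BOTTOM)
    return total


salary_of = totalize(SALARY, DOMAIN)
```

With these definitions `salary_of("Ann")` is `52000`, while both
`salary_of("belt-7")` and `salary_of(BOTTOM)` are `BOTTOM`, so the function has
a value in the domain for every member of the domain.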

Now we need to address the nonbasic function symbols, i.e., those of the form
$\alpha \bullet \beta$ which correspond to nested functional expressions in
ordinary language like “The salary of the father of.” Intuitively, we want
$\mathcal{F}(\alpha \bullet \beta)$ to be the composition of
$\mathcal{F}(\beta)$ with $\mathcal{F}(\alpha)$, i.e.,
$\mathcal{F}(\alpha) \circ \mathcal{F}(\beta)$, where in general
$(\Phi \circ \Psi)(x) = \Phi(\Psi(x))$⁶ (in terms of our example, the
composition of the function expressed by “The salary of” with the function
expressed by “the father of”). Notice that by our trick with $\bot$, the
composition of any two functions will always be total.
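
A short sketch shows why composition stays total once undefined cases are
mapped to the distinguished object. The Python helpers `total` and `compose`
and the individuals named in the lookup tables are hypothetical; the
composition rule itself is the one stated above, with the inner function
allowed to take several arguments.

```python
BOTTOM = "⊥"  # stand-in for the distinguished object


def total(table):
    """Turn a lookup table into a one-place function that yields BOTTOM
    for every argument not listed in the table."""
    return lambda x: table.get(x, BOTTOM)


def compose(phi, psi):
    """(phi ∘ psi)(x1, ..., xn) = phi(psi(x1, ..., xn)), where psi is
    n-place and phi is one-place."""
    return lambda *args: phi(psi(*args))


# Hypothetical individuals: Cal's father is Bob, and Bob earns 48000.
father_of = total({"Cal": "Bob"})
salary_of = total({"Bob": 48000})
salary_of_father_of = compose(salary_of, father_of)

# The nested expression "The successor of the sum of ... and ...".
successor_of_sum = compose(lambda n: n + 1, lambda m, n: m + n)
```

Because `salary_of` maps `BOTTOM` to `BOTTOM`, an undefined inner value simply
propagates outward: `salary_of_father_of("Bob")` is `BOTTOM`, since Bob's
father is not recorded, and no argument leaves the composite without a value.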

3.1.2  Interpretations  of  Predicates 

Finally, for any one-place predicate $\mathsf{P}$, we let
$\mathcal{F}(\mathsf{P})$ be a subset of $\mathcal{D}$: intuitively, the set of
things that have the property expressed by $\mathsf{P}$. And for any $n$-place
predicate $\mathsf{R}$, $n > 1$, we let $\mathcal{F}(\mathsf{R})$ be a set of
$n$-tuples of elements of $\mathcal{D}$: intuitively, the set of $n$-tuples of
objects in $\mathcal{D}$ that stand in the relation expressed by $\mathsf{R}$.
Thus, for example, if we want $L$ to abbreviate the verb “loves,” then if our
domain $\mathcal{D}$ consists of the population of Texas, then $\mathcal{F}(L)$
will be the set of all pairs $\langle a, b \rangle$ such that $a$ loves $b$.
Formally, then, for all $n$-place predicates $\mathsf{P}$,
$\mathcal{F}(\mathsf{P}) \subseteq \mathcal{D}^n$.⁷

⁶ Note that $\circ$ is a metalinguistic symbol of our extended English that
expresses the meaning of our object language symbol $\bullet$, viz., the
composition function.

If one wishes to include the identity predicate $\approx$ in one’s language,
and have it carry its intended meaning, then one needs an additional, more
specific semantical rule designed to do this. Identity, of course, is a
relation that holds between any object and itself, but not between itself and
any other object. This additional semantical constraint is easy to express
formally: if our language $\mathcal{L}$ contains $\approx$, then the
interpretation of $\approx$ is the set of all pairs $\langle o, o \rangle$ such
that $o$ is an element of the domain $\mathcal{D}$, i.e., more formally,

$$\mathcal{F}(\approx) = \{\langle o, o \rangle \mid o \in \mathcal{D}\}.$$
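
The interpretations of predicates, including the special rule for identity, can
be sketched with ordinary sets. In this Python illustration the domain, the
extensions `HAPPY` and `LOVES`, and the function names are hypothetical;
one-place extensions are stored as plain subsets of the domain and $n$-place
extensions as sets of $n$-tuples, following the description above.

```python
DOMAIN = {"Ann", "Bob", "Cal"}  # a hypothetical three-member domain

HAPPY = {"Ann"}                            # one-place: a subset of DOMAIN
LOVES = {("Ann", "Bob"), ("Bob", "Bob")}   # two-place: a set of pairs


def identity_extension(domain):
    """The interpretation of the identity predicate: every pair (o, o)
    with o in the domain, and nothing else."""
    return {(o, o) for o in domain}


def is_extension(ext, n, domain):
    """Check that `ext` is a legitimate interpretation of an n-place
    predicate, i.e. a set of n-tuples of members of `domain`."""
    if n == 1:
        return ext <= domain
    return all(
        isinstance(t, tuple) and len(t) == n and all(o in domain for o in t)
        for t in ext
    )
```

Note that `identity_extension(DOMAIN)` is fixed entirely by the domain, unlike
`HAPPY` or `LOVES`, which an interpretation function is free to choose.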

3.2  Truth

3.2.1  Variable  Assignments 

Given a structure $M = \langle \mathcal{D}, \mathcal{F} \rangle$ for
$\mathcal{L}$ (cf. the definition at the beginning of Section 3.1) we can
define what it is for a formula of $\mathcal{L}$ to be true in $M$. As usual,
this is done recursively. First we need to introduce the notion of an
assignment $\alpha$ for the variables, which is a sort of addendum to our
interpretation function: it assigns members of the domain to variables.
Relative to an assignment function $\alpha$, we can define the interpretation
of a complex term $\mathsf{f}(\mathsf{t}_1, \ldots, \mathsf{t}_n)$, for any
function symbol $\mathsf{f}$ and any terms $\mathsf{t}_1, \ldots, \mathsf{t}_n$.
An interpretation function $\mathcal{F}$ alone does not suffice for this since
complex functional terms might contain variables, e.g., the term $f(x)$, which
are ignored by interpretation functions. But if we supplement $\mathcal{F}$
with an assignment $\alpha$ for the variables, then we have something for the
function $\mathcal{F}(f)$ to work on. Specifically, the interpretation of the
term $f(x)$ under $\alpha$, $\mathcal{F}_\alpha(f(x))$, is just the function
$\mathcal{F}(f)$ applied to $\alpha(x)$, the value assigned to $x$ by $\alpha$.

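⁷ Where $\mathcal{D}^1 = \mathcal{D}$, and
$\mathcal{D}^{n+1} = \mathcal{D}^n \times \mathcal{D}$; i.e., $\mathcal{D}^1$
is just $\mathcal{D}$ itself, $\mathcal{D}^2$ is the set of all pairs of
members of $\mathcal{D}$ (i.e., the Cartesian product
$\mathcal{D} \times \mathcal{D}$), $\mathcal{D}^3$ the set of all triples of
members of $\mathcal{D}$, and in general $\mathcal{D}^n$ is the set of all
$n$-tuples of members of $\mathcal{D}$.

In general, then, let $\mathcal{F}_\alpha$ be the result of adding $\alpha$ to
$\mathcal{F}$.⁸ Then the interpretation
$\mathcal{F}_\alpha(\mathsf{f}(\mathsf{t}_1, \ldots, \mathsf{t}_n))$ of a
complex term $\mathsf{f}(\mathsf{t}_1, \ldots, \mathsf{t}_n)$ under $\alpha$ is
simply the result of applying the function $\mathcal{F}_\alpha(\mathsf{f})$
(which is just $\mathcal{F}(\mathsf{f})$, since $\mathsf{f}$ is a function
symbol) to the objects
$\mathcal{F}_\alpha(\mathsf{t}_1), \ldots, \mathcal{F}_\alpha(\mathsf{t}_n)$,
i.e.,
$\mathcal{F}_\alpha(\mathsf{f})(\mathcal{F}_\alpha(\mathsf{t}_1), \ldots, \mathcal{F}_\alpha(\mathsf{t}_n))$.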
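
The interpretation of terms under an assignment lends itself to a direct
recursive sketch. In the Python below, `eval_term`, the dictionaries `interp`
and `assignment`, and the sample individuals are assumptions of the example; a
symbol is treated as a variable exactly when the assignment gives it a value,
which is a simplification adopted here rather than something stated in the
text.

```python
BOTTOM = "⊥"  # stand-in for the distinguished object


def eval_term(term, interp, assignment):
    """Interpret `term` under the interpretation plus the assignment.

    term:       a constant or variable (str), or a tuple (f, t1, ..., tn)
    interp:     maps constants to objects and function symbols to functions
    assignment: maps variables to objects
    """
    if isinstance(term, tuple):
        f, *args = term
        values = [eval_term(t, interp, assignment) for t in args]
        return interp[f](*values)   # apply the function to the values
    if term in assignment:          # a variable: use the assignment
        return assignment[term]
    return interp[term]             # a constant: use the interpretation


# Hypothetical interpretation: Cal's father is Bob, and Bob earns 48000.
interp = {
    "c": "Cal",
    "father_of": lambda x: {"Cal": "Bob"}.get(x, BOTTOM),
    "salary_of": lambda x: {"Bob": 48000}.get(x, BOTTOM),
}
assignment = {"x": "Cal"}
```

For instance, `eval_term(("salary_of", ("father_of", "x")), interp, assignment)`
first looks up the value of `x`, then applies the two interpreted functions
from the inside out, giving `48000`; changing the assignment changes the result
while the interpretation stays fixed.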

3.2.2  Truth  Under  an  Assignment 

Atomic Formulas. Our goal in this section is to define the notion of a formula
being true in a structure $M$. To do so, we will first define a closely related
notion, viz., that of truth under an assignment $\alpha$. For convenience, we
will sometimes speak of a formula being “true$_\alpha$ in $M$” instead of being
“true in $M$ under $\alpha$.” We start by defining this notion for atomic
formulas. So let $\varphi$ be an atomic formula
$\mathsf{P}\mathsf{t}_1 \ldots \mathsf{t}_n$. Then $\varphi$ is true$_\alpha$
in $M$ just in case
$\langle \mathcal{F}_\alpha(\mathsf{t}_1), \ldots, \mathcal{F}_\alpha(\mathsf{t}_n) \rangle \in \mathcal{F}_\alpha(\mathsf{P})$.
Intuitively, then, where $n = 1$, $\mathsf{P}\mathsf{t}$ is true$_\alpha$ in
$M$ just in case the object in $\mathcal{D}$ that $\mathsf{t}$ denotes is in
the set of things that have the property expressed by $\mathsf{P}$. And for
$n > 1$, $\mathsf{P}\mathsf{t}_1 \ldots \mathsf{t}_n$ is true$_\alpha$ just in
case the $n$-tuple of objects $\langle o_1, \ldots, o_n \rangle$ denoted by
$\mathsf{t}_1, \ldots, \mathsf{t}_n$ respectively is in the set of $n$-tuples
whose members stand in the relation expressed by $\mathsf{P}$, i.e., just in
case those objects stand in that relation.

Let us actually construct a small language $\mathcal{L}^*$ and build a small
structure $M^*$ to illustrate these ideas. Suppose we have four names $a$,
$b$, $c$, $d$, a single function symbol $h$ (intuitively, to abbreviate “the
husband of”), a one-place predicate $H$ (intuitively, to abbreviate “is
happy”), and a three-place predicate $T$ (intuitively, to abbreviate “is
talking to ... about”). Let us also include the distinguished predicate
$\approx$, though we will make no real use of it until later. We will use
$x$, $y$, and $z$ for our variables.

For our structure $M^*$, we will take our domain $\mathcal{D}$ to be a set of
three individuals, {Beth, Charlie, Di}, and our interpretation function
$\mathcal{G}$ will be defined as follows. For our constants, $\mathcal{G}(a) =
\mathcal{G}(b)$ = Beth, $\mathcal{G}(c)$ = Charlie, and $\mathcal{G}(d)$ = Di.
(Beth thus has two names in our language; this is to illustrate a point to be
made several sections hence.) For our function symbol $h$, we let
$\mathcal{G}(h)$(Beth) = $\mathcal{G}(h)$(Charlie) = $\bot$ (so that
$\mathcal{G}(h)$ is “undefined” on Beth and Charlie), and
$\mathcal{G}(h)$(Di) = Charlie. For our predicates $H$ and $T$, we let
$\mathcal{G}(H)$ = {Beth, Di} (so, intuitively, Beth and Di are happy), and
$\mathcal{G}(T)$ = {$\langle$Beth, Di, Charlie$\rangle$, $\langle$Charlie,
Charlie, Di$\rangle$} (so, intuitively, Beth is talking to Di about Charlie,
and Charlie is talking to himself about Di). Following the rule for
$\approx$, we let $\mathcal{G}(\approx)$ = {$\langle$Beth, Beth$\rangle$,
$\langle$Charlie, Charlie$\rangle$, $\langle$Di, Di$\rangle$}. Finally, for
our assignment function $\beta$, let us let $\beta(x) = \beta(y)$ = Charlie,
and $\beta(z)$ = Di.

8 I.e., if $\xi$ is a constant, function symbol, or predicate,
$\mathcal{F}_\alpha(\xi) = \mathcal{F}(\xi)$; if $\xi$ is a variable, then
$\mathcal{F}_\alpha(\xi) = \alpha(\xi)$.

Let us now check that $Hd$ and $Tbdh(z)$ are true in $M^*$ under $\beta$. In
the first case, by the above, $Hd$ is true$_\beta$ in $M^*$ just in case
$\mathcal{G}_\beta(d) \in \mathcal{G}_\beta(H)$, i.e., just in case Di is an
element of the set {Beth, Di}, which she is. So $Hd$ is true$_\beta$ in $M^*$.
Similarly, $Tbdh(z)$ is true$_\beta$ in $M^*$ just in case
$\langle \mathcal{G}_\beta(b), \mathcal{G}_\beta(d), \mathcal{G}_\beta(h(z))
\rangle \in \mathcal{G}_\beta(T)$, i.e., just in case
$\langle \mathcal{G}(b), \mathcal{G}(d), \mathcal{G}(h)(\beta(z)) \rangle \in
\mathcal{G}(T)$, i.e., just in case $\langle$Beth, Di,
$\mathcal{G}(h)$(Di)$\rangle$ $\in$ {$\langle$Beth, Di, Charlie$\rangle$,
$\langle$Charlie, Charlie, Di$\rangle$}, i.e., just in case $\langle$Beth, Di,
Charlie$\rangle$ $\in$ {$\langle$Beth, Di, Charlie$\rangle$, $\langle$Charlie,
Charlie, Di$\rangle$}. Since this obviously holds, the formula $Tbdh(z)$ is
true$_\beta$ in $M^*$.

A formula is false$_\alpha$ in a structure $M$, of course, just in case it is
not true$_\alpha$ in $M$. It is easy to verify that, for example, $Hh(b)$,
$Hx$, and $Tdbc$ are all false in $M^*$ under $\beta$.
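These checks can be replayed mechanically. The following is a minimal sketch of $M^*$ and the atomic truth clause, assuming a dictionary encoding of $\mathcal{G}$ and $\beta$; the names G, beta, denote, and atomic_true are illustrative, and representing $\bot$ by Python's None is an assumption of the sketch, not something fixed by the text.

```python
BOTTOM = None  # stands in for the "undefined" value

G = {
    "a": "Beth", "b": "Beth", "c": "Charlie", "d": "Di",
    "h": {"Beth": BOTTOM, "Charlie": BOTTOM, "Di": "Charlie"},
    "H": {("Beth",), ("Di",)},                        # 1-tuples of happy individuals
    "T": {("Beth", "Di", "Charlie"), ("Charlie", "Charlie", "Di")},
}
beta = {"x": "Charlie", "y": "Charlie", "z": "Di"}

def denote(term, assignment):
    """Denotation of a term; ("h", t) encodes the complex term h(t)."""
    if isinstance(term, tuple):
        symbol, arg = term
        return G[symbol].get(denote(arg, assignment), BOTTOM)
    return assignment[term] if term in assignment else G[term]

def atomic_true(pred, terms, assignment):
    """An atomic formula is true iff the tuple of denotations is in G(pred)."""
    return tuple(denote(t, assignment) for t in terms) in G[pred]

print(atomic_true("H", ["d"], beta))                    # Hd
print(atomic_true("T", ["b", "d", ("h", "z")], beta))   # Tbdh(z)
print(atomic_true("H", [("h", "b")], beta))             # Hh(b)
print(atomic_true("H", ["x"], beta))                    # Hx
print(atomic_true("T", ["d", "b", "c"], beta))          # Tdbc
```

Because the undefined value never appears in any tuple of $\mathcal{G}(H)$ or $\mathcal{G}(T)$, an atomic formula containing an undefined term simply comes out false in this sketch.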

Conjunctions, Negations, etc. Now for the more complex cases. Suppose first
that $\varphi$ is a formula of the form $\neg\psi$. Then $\varphi$ is
true$_\alpha$ in a structure $M$ just in case $\psi$ is not true$_\alpha$ in
$M$. In so defining truth for negated formulas we ensure that the symbol
$\neg$ means what we have intended. Things are much the same for the other
symbols. Thus, suppose $\varphi$ is a formula of the form $\psi \wedge
\theta$. Then $\varphi$ is true$_\alpha$ in $M$ just in case both $\psi$ and
$\theta$ are. If $\varphi$ is a formula of the form $\psi \vee \theta$, then
$\varphi$ is true$_\alpha$ in $M$ just in case either $\psi$ or $\theta$ is.
If $\varphi$ is a formula of the form $\psi \supset \theta$, then $\varphi$ is
true$_\alpha$ in $M$ just in case either $\psi$ is false$_\alpha$ in $M$ or
$\theta$ is true$_\alpha$ in $M$. And if $\varphi$ is a formula of the form
$\psi \equiv \theta$, then $\varphi$ is true$_\alpha$ in $M$ just in case
$\psi$ and $\theta$ have the same truth value in $M$ under $\alpha$.

The reader should test his or her comprehension of these rules by verifying
that $\neg Hh(b)$ and $(Tbdh(z) \wedge Tccy) \supset Hd$ are both true in
$M^*$ under $\beta$.

Quantified Formulas. Last, we turn to quantified formulas. When we introduced
the quantifiers above, we noted that “Some individual is happy,” i.e.,
$\exists x Hx$, can be paraphrased as “for some value of the variable ‘x,’ the
expression ‘x is happy’ is true.” This is essentially what our formal
semantics for existentially quantified formulas will come to. To anticipate
things a bit, $\exists x Hx$ will be true in a structure $M$ under $\alpha$,
roughly, just in case the unquantified formula $Hx$ is true in $M$ under some
(in general, new) assignment $\alpha'$ such that $\alpha'(x)$ is in the
interpretation of $H$. It is easy to verify that this formula is true in our
little structure $M^*$ under $\beta$, when we look at a new assignment
function $\beta'$ that assigns either Beth or Di to the variable $x$. Thus,
$\exists x Hx$ should come out true in $M^*$ under $\beta$.

But we have to be a little more careful, because some formulas ($Tcxz$, for
example) contain more than one unquantified variable. Thus, when we are
evaluating a quantification of such a formula ($\exists z Tcxz$, say) we have
to be sure that the new assignment function $\alpha'$ does not change the
value of any of the unquantified variables (in this case, the variable $x$).
Otherwise we could change the sense of the unquantified formula in
mid-evaluation. Under the assignment function $\beta$ above, $\exists z Tcxz$
intuitively says that Charlie is talking to himself about someone (recall that
$\beta(x)$ = Charlie), and this should turn out to be true$_\beta$ in $M^*$
since Charlie is talking to himself about Di, i.e., $\langle$Charlie, Charlie,
Di$\rangle \in \mathcal{G}_\beta(T)$. But suppose all we require is that there
be some new assignment function $\beta'$ such that $\beta'(z)$ is Di. Then it
could turn out also that $\beta'(x)$ is Beth. But then the formula $Tcxz$
would not be true in $M^*$ under $\beta'$, since Charlie is not talking to
Beth about Di, i.e., $\langle$Charlie, Beth, Di$\rangle \notin
\mathcal{G}_\beta(T)$, and hence we would not be able to count
$\exists z Tcxz$ as true in $M^*$ under $\beta$ after all as we should like.

All that is needed is a simple and obvious restriction: when evaluating the
formula $\exists z Tcxz$, the new assignment function that we use to evaluate
$Tcxz$ must not be allowed to differ from $\beta$ on any variable except $z$
(and even then it needn’t differ from $\beta$, in which case it is $\beta$).
More generally, we put the matter like this: if $\varphi$ is an existentially
quantified formula $\exists x \psi$, then $\varphi$ is true in a structure $M$
under $\alpha$ just in case there is an assignment function $\alpha'$ just
like $\alpha$ except perhaps in what it assigns to $x$ such that the formula
$\psi$ is true in $M$ under $\alpha'$. If $\varphi$ is a universally
quantified formula $\forall x \psi$, then $\varphi$ is true in $M$ under
$\alpha$ just in case for every assignment function $\alpha'$ just like
$\alpha$ except perhaps in what it assigns to $x$ the formula $\psi$ is true
in $M$ under $\alpha'$. That is, in essence, $\varphi$ is true in $M$ under
$\alpha$ just in case $\psi$ is true in $M$ no matter what value in the domain
we assign to $x$ (while keeping all other variable assignments fixed).

The reader can once again test his or her comprehension by showing in detail
that $\exists x Txbh(z)$ is false in $M^*$ under $\beta$ and that
$\forall x(Hx \vee Tbdx)$ is true in $M^*$ under $\beta$.
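The clauses for connectives and quantifiers can be put together in a small recursive evaluator over $M^*$. What follows is a sketch under stated assumptions: formulas are encoded as nested tuples with hypothetical tags ("atom", "not", "and", "or", "implies", "iff", "exists", "forall"), and the function names denote and true_under are illustrative. The quantifier clause builds each assignment that is just like the current one except perhaps at the quantified variable.

```python
DOMAIN = ["Beth", "Charlie", "Di"]
G = {
    "a": "Beth", "b": "Beth", "c": "Charlie", "d": "Di",
    "h": {"Di": "Charlie"},                 # undefined (None) on Beth and Charlie
    "H": {("Beth",), ("Di",)},
    "T": {("Beth", "Di", "Charlie"), ("Charlie", "Charlie", "Di")},
}
beta = {"x": "Charlie", "y": "Charlie", "z": "Di"}

def denote(term, alpha):
    if isinstance(term, tuple):             # ("h", t) encodes h(t)
        return G[term[0]].get(denote(term[1], alpha))
    return alpha[term] if term in alpha else G[term]

def true_under(formula, alpha):
    op = formula[0]
    if op == "atom":                        # ("atom", P, t1, ..., tn)
        return tuple(denote(t, alpha) for t in formula[2:]) in G[formula[1]]
    if op == "not":
        return not true_under(formula[1], alpha)
    if op == "and":
        return true_under(formula[1], alpha) and true_under(formula[2], alpha)
    if op == "or":
        return true_under(formula[1], alpha) or true_under(formula[2], alpha)
    if op == "implies":
        return (not true_under(formula[1], alpha)) or true_under(formula[2], alpha)
    if op == "iff":
        return true_under(formula[1], alpha) == true_under(formula[2], alpha)
    if op in ("exists", "forall"):          # (op, variable, body)
        var, body = formula[1], formula[2]
        # Each variant is just like alpha except perhaps in what it assigns to var.
        variants = (true_under(body, {**alpha, var: obj}) for obj in DOMAIN)
        return any(variants) if op == "exists" else all(variants)
    raise ValueError(f"unknown formula: {formula!r}")

Tbdhz = ("atom", "T", "b", "d", ("h", "z"))
not_Hhb = ("not", ("atom", "H", ("h", "b")))
conditional = ("implies", ("and", Tbdhz, ("atom", "T", "c", "c", "y")), ("atom", "H", "d"))
ex_Txbhz = ("exists", "x", ("atom", "T", "x", "b", ("h", "z")))
all_Hx_or_Tbdx = ("forall", "x", ("or", ("atom", "H", "x"), ("atom", "T", "b", "d", "x")))
ex_z_Tcxz = ("exists", "z", ("atom", "T", "c", "x", "z"))

for f in (not_Hhb, conditional, ex_Txbhz, all_Hx_or_Tbdx, ex_z_Tcxz):
    print(true_under(f, beta))
```

Copying the assignment with a single changed entry, rather than building an arbitrary new assignment, is what keeps the values of the other free variables fixed during evaluation.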

3.2.3  Truth 

Now, finally, we can define a formula to be true in a structure $M$
simpliciter just in case it is true$_\alpha$ in $M$ for all assignments
$\alpha$, and false in $M$ just in case it is false$_\alpha$ in $M$ for all
$\alpha$. Note, on this definition, that for most any interpretation, there
will be formulas that are neither true nor false in the interpretation. Our
example $\exists z Tbxz$ above, for instance, is neither true nor false in
$M^*$, since there are assignments $\alpha$ on which it comes out
true$_\alpha$ (all those on which $\alpha(x)$ = Di) and assignments $\alpha$
on which it comes out false$_\alpha$ (all those on which $\alpha(x) \neq$ Di).
Such formulas will always have free variables, since it is the semantic
indeterminacy of such variables that is responsible for this fact. However,
note that some formulas with free variables will be true or false in some
models, though these will typically be logical falsehoods (or truths) like
$Hx \wedge \neg Hx$, i.e., formulas which are not capable of true (resp.,
false) interpretation.
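Since $M^*$ is finite, truth simpliciter can be checked by brute force over every assignment to $x$, $y$, and $z$. The sketch below hand-codes the two formulas discussed above as Python functions; the names classify, exists_z_Tbxz, and Hx_and_not_Hx are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

DOMAIN = ["Beth", "Charlie", "Di"]
H = {"Beth", "Di"}
T = {("Beth", "Di", "Charlie"), ("Charlie", "Charlie", "Di")}
BETH = "Beth"  # denotation of the constant b

def exists_z_Tbxz(alpha):
    """Truth under alpha of the formula: there is a z such that Tbxz."""
    return any((BETH, alpha["x"], obj) in T for obj in DOMAIN)

def Hx_and_not_Hx(alpha):
    """Truth under alpha of the formula: Hx and not Hx."""
    return alpha["x"] in H and not (alpha["x"] in H)

def classify(formula):
    """Return 'true', 'false', or 'neither' according to truth simpliciter."""
    assignments = [dict(zip("xyz", vals)) for vals in product(DOMAIN, repeat=3)]
    values = {formula(alpha) for alpha in assignments}
    if values == {True}:
        return "true"
    if values == {False}:
        return "false"
    return "neither"

print(classify(exists_z_Tbxz))
print(classify(Hx_and_not_Hx))
```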

4  Logic 

4.1  Propositional  Logic 

Now that we have the notion of a first-order language and its semantics, we
want to capture the meanings of the logical constants $\neg$, $\wedge$,
$\vee$, $\supset$, $\equiv$, $\forall$, and $\exists$ as explicated in the
semantics. We will do this in the usual way by developing a rigorous and
precise logic. A logic, in the sense relevant here, is a systematic
characterization of correct principles of reasoning with respect to a given
cluster of concepts. The concepts here are those expressed by the logical
constants above, corresponding roughly, once again, to the ordinary language
concepts of negation (not, or it is not the case that), conjunction (and),
disjunction (or), material implication (if ... then), material equivalence
(if and only if), existential quantification (some), and universal
quantification (every, or all). Such a system usually consists of two
components: axioms and rules of inference. We start with the axioms for the
propositional connectives.

4.1.1  Axioms  for  Propositional  Connectives 

The axioms for the propositional connectives $\neg$, $\wedge$, $\vee$,
$\supset$, and $\equiv$ constitute the basis of propositional logic and can be
thought of as characterizing their meanings. There are many equivalent
axiomatizations for propositional logic, but the following, which makes use of
the notion of an axiom schema, is one of the easiest. An axiom schema is not
itself an axiom, but rather a sort of template, a general form any instance of
which is an axiom. Axiom schemas are thus not themselves actually part of the
language. Thus, where $\varphi$, $\psi$, and $\theta$ are any formulas, any
instance of any of the following schemas is an axiom:

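A1 $\varphi \supset (\psi \supset \varphi)$

A2 $(\varphi \supset (\psi \supset \theta)) \supset ((\varphi \supset \psi) \supset (\varphi \supset \theta))$

A3 $(\varphi \supset \psi) \supset ((\neg\varphi \supset \psi) \supset \psi)$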

In English, A1 says essentially that if a sentence $\varphi$ is true, then for
any other sentence $\psi$, if $\psi$ is true then $\varphi$ is still true. A2
says that if a sentence $\varphi$ implies that if $\psi$ is true then so is
$\theta$, then if $\varphi$ implies $\psi$, then it also implies $\theta$.
Finally, A3 says essentially that if a sentence $\varphi$ implies another
sentence $\psi$, then if $\psi$ is also implied by the negation of $\varphi$,
then $\psi$ is true no matter what (since either $\varphi$ or its negation is
true no matter what). These axioms seem trivial. However, like the elementary
truths of arithmetic or geometry that are second nature to us now, they must
be explicitly stated as a basis for deriving other, less obvious truths; they
cannot be conjured out of thin air.

Notice that the axiom schemas only use the two connectives $\neg$ and
$\supset$. Even though we have been using the other propositional connectives
all along, officially we will consider these to be our two “primitive”
connectives; the others can be defined in terms of them as follows (where the
symbol $=_{df}$ means “is defined as”):

Def 1: $(\varphi \vee \psi) =_{df} (\neg\varphi \supset \psi)$

Def 2: $(\varphi \wedge \psi) =_{df} \neg(\neg\varphi \vee \neg\psi)$

Def 3: $(\varphi \equiv \psi) =_{df} (\varphi \supset \psi) \wedge (\psi \supset \varphi)$


The reader can again test comprehension by showing that, no matter what
truth values are assigned to $\varphi$ and $\psi$, the two sides of each
definition will always have the same truth values when evaluated in accord
with the semantical rules given above for the connectives in Section 3.2.2.
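This exercise amounts to a truth-table comparison, which can be sketched as follows. The function names are illustrative; neg and implies implement the semantic rules for the two primitive connectives, and the defined connectives are built from them exactly as in Def 1 through Def 3 and compared against Python's own boolean operators.

```python
from itertools import product

def neg(p):
    return not p

def implies(p, q):
    # Semantic rule for the conditional: false antecedent or true consequent.
    return (not p) or q

def or_def(p, q):      # Def 1
    return implies(neg(p), q)

def and_def(p, q):     # Def 2
    return neg(or_def(neg(p), neg(q)))

def iff_def(p, q):     # Def 3
    return and_def(implies(p, q), implies(q, p))

def definitions_agree():
    """Check every row of the two-variable truth table."""
    return all(
        or_def(p, q) == (p or q)
        and and_def(p, q) == (p and q)
        and iff_def(p, q) == (p == q)
        for p, q in product([True, False], repeat=2)
    )

print(definitions_agree())
```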

4.1.2  Rules  of  Inference:  Modus  Ponens 

A  logic  is  not  much  good  without  rules  of  inference,  which  are  rules  that 
allow  us  to  move  from  statements  that  we  know  or  assume  to  be  true  at  the 
outset  (e.g.,  our  axioms),  to  new  statements  that  follow  logically  from  them 
(called  theorems).  Without  them,  all  we  could  do  is  write  down  axioms;  there 
would  be  no  way  to  infer  new  truths  from  those  already  given.  There  is  only 
one  rule  of  inference  in  propositional  logic: 

Modus Ponens (MP): If the formulas $\varphi$ and $\varphi \supset \psi$ follow
from the axioms of propositional logic, then we may infer that $\psi$ does as
well.9

As a simple example using our language $\mathcal{L}^*$, consider the following
proof of $Hd \supset Hd$, i.e., the statement “If Di is happy, then Di is
happy.” Note that, trivial as it is, $Hd \supset Hd$ is not an instance of an
axiom schema, and hence if it is to be a theorem of our system, it must be
derivable from the axioms using our rule of inference MP. This is in fact the
case. As an instance of A1, we have

$Hd \supset ((Hd \supset Hd) \supset Hd)$.

As an instance of A2 we have

$(Hd \supset ((Hd \supset Hd) \supset Hd)) \supset ((Hd \supset (Hd \supset Hd)) \supset (Hd \supset Hd))$.

By MP, it follows from these two statements that

$(Hd \supset (Hd \supset Hd)) \supset (Hd \supset Hd)$.

But

$Hd \supset (Hd \supset Hd)$

is an instance of A1 again, hence by MP once more we can infer
$Hd \supset Hd$ from the latter two statements.

9 Given this, the notion of theoremhood can be defined precisely as follows. A
formula $\varphi$ is a theorem of propositional logic if and only if there is
a sequence $\varphi_1, \ldots, \varphi_n$ such that $\varphi_n$ is $\varphi$
and each $\varphi_i$ is either an axiom or follows from previous lines by MP,
that is, there are previous formulas $\varphi_j$, $\varphi_k$, $j, k < i$,
such that $\varphi_k$ is $\varphi_j \supset \varphi_i$. We can also define the
notion of a formula $\varphi$ following from a set of formulas $\Gamma$ in the
same way, except that $\varphi_i$ in the above definition could also be a
member of $\Gamma$.

There are many equivalent systems of propositional logic that are more
streamlined and computationally more efficient than the basic system here;
but this is the foundation on which they are all built and illustrates well
enough how the process of deduction works.
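The derivation of $Hd \supset Hd$ can be checked mechanically. The sketch below is one possible encoding, not a prescribed one: conditionals are tuples ("->", A, B), negations are tuples ("~", A), atomic formulas are strings, and the helper names (imp, is_a1, is_a2, is_a3, check_proof) are hypothetical. A line is accepted if it matches one of the schemas A1 to A3 or follows from two earlier lines by MP.

```python
def imp(a, b):
    return ("->", a, b)

def is_imp(f):
    return isinstance(f, tuple) and len(f) == 3 and f[0] == "->"

def is_a1(f):
    # phi -> (psi -> phi)
    return is_imp(f) and is_imp(f[2]) and f[2][2] == f[1]

def is_a2(f):
    # (phi -> (psi -> theta)) -> ((phi -> psi) -> (phi -> theta))
    if not (is_imp(f) and is_imp(f[1]) and is_imp(f[1][2])):
        return False
    phi, psi, theta = f[1][1], f[1][2][1], f[1][2][2]
    return f[2] == imp(imp(phi, psi), imp(phi, theta))

def is_a3(f):
    # (phi -> psi) -> ((~phi -> psi) -> psi)
    if not (is_imp(f) and is_imp(f[1])):
        return False
    phi, psi = f[1][1], f[1][2]
    return f[2] == imp(imp(("~", phi), psi), psi)

def check_proof(lines):
    """True if every line is an axiom instance or follows from earlier lines by MP."""
    for i, f in enumerate(lines):
        earlier = lines[:i]
        by_mp = any(imp(g, f) in earlier for g in earlier)
        if not (is_a1(f) or is_a2(f) or is_a3(f) or by_mp):
            return False
    return True

Hd = "Hd"
proof = [
    imp(Hd, imp(imp(Hd, Hd), Hd)),                      # instance of A1
    imp(imp(Hd, imp(imp(Hd, Hd), Hd)),
        imp(imp(Hd, imp(Hd, Hd)), imp(Hd, Hd))),        # instance of A2
    imp(imp(Hd, imp(Hd, Hd)), imp(Hd, Hd)),             # MP on the two lines above
    imp(Hd, imp(Hd, Hd)),                               # instance of A1
    imp(Hd, Hd),                                        # MP on the two lines above
]
print(check_proof(proof))
print(check_proof([imp(Hd, Hd)]))   # the bare conclusion is not an axiom instance
```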

4.2  Predicate  Logic 

4.2.1  Axioms  for  the  Quantifiers 

When we add axioms for the quantifiers to propositional logic, we have full
predicate logic, also known as first-order logic and quantification theory.
The quantifiers are interdefinable, so we only need to take one of them as
primitive. The axioms for predicate logic are usually stated in terms of the
universal quantifier $\forall$, so we will take that as our primitive, and
shall define $\exists$ as follows:

Def 4: $\exists x \varphi =_{df} \neg\forall x \neg\varphi$.

That this definition is correct is clear on a moment’s reflection. So, for
example, there exists an $x$ such that $x$ is happy, i.e., someone is happy,
just in case it is not the case that for all $x$, $x$ is not happy, i.e., just
in case not everyone is unhappy.

We can now state three new quantificational axiom schemas. For any formula
$\varphi$ and term $t$, we let $\varphi^x_t$ stand for the result of replacing
all unbound occurrences of $x$ in $\varphi$ with $t$. Then any instance of the
following is an axiom:

A4 $\forall x \varphi \supset \varphi^x_t$, so long as $t$ does not contain,
and is not itself, a variable that becomes bound in $\varphi^x_t$.

A5 $\forall x(\varphi \supset \psi) \supset (\forall x \varphi \supset \forall x \psi)$.

A6 $\varphi \supset \forall x \varphi$, where $x$ does not occur unbound in
$\varphi$.

The intuitive idea behind these axioms is straightforward. A4 simply says
that if something is true of everything in general, it is true in particular
of anything we can name. Thus, for example,
$\forall x(NUM(x) \supset \exists y(y = x + 1)) \supset (NUM(24) \supset
\exists y(y = 24 + 1))$; i.e., if for every number there is a number one
greater than it, then in particular there is a number one greater than 24.

Reverting to our language $\mathcal{L}^*$ and its structure $M^*$, we have as
an instance of this axiom schema

$\forall x(Hx \vee \exists y Txxy) \supset (Hc \vee \exists y Tccy)$.

The antecedent here (i.e., the formula to the left of the $\supset$),
$\forall x(Hx \vee \exists y Txxy)$, is in fact true in $M^*$, i.e., in $M^*$,
everyone is either happy or talking to themselves about someone. Thus, if we
were to count this as a further “special” axiom (i.e., a nonlogical piece of
information that characterizes the situation in the specific structure we are
investigating and which might well not hold in other structures) we would be
able to prove (by Modus Ponens) that $Hc \vee \exists y Tccy$, i.e., that
Charlie is either happy or talking to himself about someone.
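The semantic side of this example is easy to confirm directly in $M^*$. In the sketch below the quantifiers are rendered with Python's all and any over the three-element domain; the variable names antecedent and consequent are illustrative only.

```python
DOMAIN = ["Beth", "Charlie", "Di"]
H = {"Beth", "Di"}
T = {("Beth", "Di", "Charlie"), ("Charlie", "Charlie", "Di")}
CHARLIE = "Charlie"  # denotation of the constant c

# For every x: Hx, or there is a y such that Txxy.
antecedent = all(
    x in H or any((x, x, y) in T for y in DOMAIN)
    for x in DOMAIN
)

# Hc, or there is a y such that Tccy.
consequent = CHARLIE in H or any((CHARLIE, CHARLIE, y) in T for y in DOMAIN)

# The A4 instance itself is a conditional: false antecedent or true consequent.
a4_instance = (not antecedent) or consequent

print(antecedent, consequent, a4_instance)
```

Here the consequent holds only through its second disjunct, since Charlie is not in the interpretation of $H$.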

The  second  schema  A5  captures  another  aspect  of  the  meaning  of  “every.” 
Consider  a  simple  example:  if  every  individual  is  such  that  if  it  is  red  then 
it  has  a  color,  then  if  in  fact  every  individual  is  red,  then  every  individual 
has  a  color.  This  is  just  an  unsymbolized  instance  of  A5,  and  illustrates  its 
validity. 

And finally, A6 simply says that a quantifier does not affect the truth of a
formula $\varphi$ if the variable it binds either does not occur in $\varphi$
or (what amounts to the same thing) occurs in $\varphi$ but is bound by
another quantifier. So, for example, if it is true that Beth is happy, $Hb$,
then it is also true for every value of $x$ that Beth is happy,
$\forall x Hb$. Similarly, if Charlie is talking to someone about Di,
$\exists z Tczd$, then it is also true that for every value of $x$, Charlie is
talking to someone about Di, $\forall x \exists z Tczd$.

4.2.2  Rules  of  Inference:  Generalization

The move to predicate logic with its quantified formulas necessitates a
further rule of inference, one designed to capture how we reason with
universal quantification. As usual, the idea is best illustrated by an
example. Suppose you
wanted  to  prove  something  about  all  prime  numbers,  for  example,  that  for 
every  prime  there  is  a  greater  prime.  You  might  begin  by  saying  something 
like  “Let  p  be  an  arbitrary  prime  number.”  You  might  even  pick  a  specific 
prime,  for  example,  17.  Then,  by  appealing  to  none  of  the  specific  properties 
of  your  chosen  prime  that  distinguish  it  from  other  primes,  e.g.,  that  it  is  less 
than  100,  or  Plato’s  favorite  number,  etc.,  you  proceed  to  prove  in  the  usual 
way  that  there  is  another  prime  greater  than  p.  You  then  conclude  that  the 
same  is  true  for  every  prime.  What  permits  you  to  do  this  is  precisely  the 
fact  that  you  did  not  appeal  to  any  properties  of  p  that  do  not  hold  for  all 
primes;  it  was,  in  a  precise  sense,  arbitrary. 

This sort of example illustrates the inference rule known as Generalization.
Informally, if you can prove that something is true of a particular individual
$o$ without appealing to anything that could not be proved of everything else
in the domain, then that same thing is true of everything. The way we capture
this idea of not appealing to anything that could not be proved of everything
else is by restricting generalization to formulas whose proofs contain no
formulas that say anything about the object being generalized upon. Thus, we
can say that if $\varphi^x_t$ follows from $\Gamma$ and the axioms of
predicate logic, and $t$ does not occur (free) in $\Gamma$, then
$\forall x \varphi$ follows from $\Gamma$ and the axioms of predicate logic.
If, then, $t$ refers to the object $o$, then the absence of $t$ from the
formulas in $\Gamma$ indicates that they say nothing about $o$. In fact, we
can actually use a simpler but equivalent inference rule that only generalizes
on variables:

Gen If $\varphi$ follows from the axioms of predicate logic, then
$\forall x \varphi$ does as well.

We noted above that special, or nonlogical, axioms are designed only to hold
within a given structure one has singled out, e.g., a structure that models a
certain manufacturing or engineering system one might be investigating. A
special axiom thus captures the “logic” of things within a restricted sphere.
Genuine logical axioms, however, should be exceptionless; a logical axiom
formulated within a given language $\mathcal{L}$ should be true in all
structures for $\mathcal{L}$. When this property holds of all the axioms of a
logical system, the system is said to be sound. Soundness is an essential
property of any logical system, since it is precisely the job of its logical
axioms to capture features that hold in any of its structures. Any axiom that
was not true in every structure could therefore not rightfully be considered a
logical axiom, and would have to be rejected. It is straightforward (and a
good exercise) to show that, for a given language $\mathcal{L}$, any instance
of any of the above axiom schemas, and anything provable from them, in fact
has this property.10

The converse of soundness, that any formula true in every structure follows
from the axioms, is known as completeness, and is much harder to
prove.  While  its  absence  from  a  formal  system  is  perhaps  not  as  disastrous 
as  the  absence  of  soundness,  completeness  is  nonetheless  a  very  important 
and  desirable  property  for  a  formal  system  to  have,  since  it  shows  that  the 
semantics  and  the  logic  of  the  system  match  up  precisely.  It  is  provable  that 
both  propositional  and  predicate  logic  are  complete. 

4.3  Identity 

4.3.1  Identity  and  Expressive  Power 

A very important concept within most any type of formal system is that of
identity, which we will express in our languages by means of the 2-place
predicate $\approx$.11 Identity adds a great deal of flexibility and
expressive power to a language. Identity is particularly useful in languages
that contain function symbols, for with identity one can explicitly identify a
named object as the value of a certain function. For example, in our language
$\mathcal{L}^*$, we can express that Charlie is Di’s husband,
$c \approx h(d)$.

Second, identity can be used to express the definite article “the.” When we
ascribe a property to something only identified as “the $\varphi$” (that the
person Charlie is talking to himself about is happy, say) we are implying
three things: (i) that there is something that fits the description $\varphi$
(that there is someone Charlie is talking to himself about), (ii) that nothing
else fits it (that Charlie is not talking to himself about anyone else), and
(iii) that that thing has the property in question (that the object of
Charlie’s attention is happy).12 All three of these components are easily
expressed in one formula with the help of the identity predicate. Thus, our
example here is expressed in our language $\mathcal{L}^*$ as follows:
$\exists x(Tccx \wedge \neg\exists y(Tccy \wedge x \not\approx y) \wedge Hx)$.
The force of the “anyone else” in (ii) above is captured by the negated
identity predicate in the formula: anyone other than, i.e., not identical to,
the person in question.

10 The proof proceeds by ordinary mathematical induction on the number of
quantifiers and connectives a formula contains.

11 We use $\approx$ as our identity predicate within languages; this is to be
distinguished from the concept of identity as it appears in our metalinguistic
talk about languages and their structures, which we have been expressing with
the more familiar =.

Finally, similar techniques can be employed to express numerical notions
without appealing explicitly to numbers. For example, one can express that at
least two philosophers are wealthy as
$\exists x \exists y(Px \wedge Py \wedge x \not\approx y)$. Note that the
third conjunct here is necessary, since the bare statement
$\exists x \exists y(Px \wedge Py)$ does not imply there are two wealthy
philosophers: both $x$ and $y$ could be assigned the same unique wealthy
philosopher as their values (convince yourself of this by referring back to
the section on the semantics of $\exists$). In a similar fashion, one can
express that there are exactly two wealthy philosophers:
$\exists x \exists y(Px \wedge Py \wedge x \not\approx y \wedge
\forall z(Pz \supset (z \approx x \vee z \approx y)))$, i.e., in English,
there are at least two wealthy philosophers $x$ and $y$, and any wealthy
philosopher is identical with either $x$ or $y$. Finally, one can also say
that there are at most two wealthy philosophers:
$\forall x \forall y \forall z((Px \wedge Py \wedge Pz) \supset
(x \approx y \vee x \approx z \vee y \approx z))$. Check to see that this
statement will be true if there are fewer than three wealthy philosophers, and
false otherwise. These forms are easily generalizable for any finite number.
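The three numerical formulas can be tested against a direct count on small finite domains. In the sketch below the quantifiers range over a hypothetical five-element domain of integers, P is an arbitrary Python predicate standing in for “is a wealthy philosopher,” and the identity predicate is rendered by Python's == and !=; all function names are illustrative.

```python
from itertools import product

def at_least_two(domain, P):
    # Some x and y with Px, Py, and x not identical to y.
    return any(P(x) and P(y) and x != y for x, y in product(domain, repeat=2))

def exactly_two(domain, P):
    # As above, and every z with Pz is identical to x or to y.
    return any(
        P(x) and P(y) and x != y
        and all((not P(z)) or z == x or z == y for z in domain)
        for x, y in product(domain, repeat=2)
    )

def at_most_two(domain, P):
    # Any three individuals satisfying P include a repeated one.
    return all(
        (not (P(x) and P(y) and P(z))) or x == y or x == z or y == z
        for x, y, z in product(domain, repeat=3)
    )

domain = list(range(5))
for size in range(5):
    wealthy = set(range(size))          # exactly `size` individuals satisfy P
    P = lambda v, w=wealthy: v in w
    print(size, at_least_two(domain, P), exactly_two(domain, P), at_most_two(domain, P))
```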

4.3.2  Axioms  for  Identity

Most systems of predicate logic include the notion of identity among the
logical constants of the system. Given one standard (though debatable)
conception of logic as the study of the most general principles of reasoning,
this seems quite appropriate, since identity is a notion that seems applicable
to most any domain about which one might reason. Irrespective of the issue of
whether identity is a logical notion, it is certainly a notion one might often
want to use within a formal system that has been tailored for a certain
purpose, and in particular, it is essential to our constraint languages.
However, the only way to ensure that the identity predicate carries its
intended meaning within a given system is to build that meaning into the
system by means of appropriate axioms. The usual axioms for identity are, as
above, presented in the form of schemas, and are also straightforward:

12 This is the essence of Bertrand Russell’s theory of descriptions, first
developed in his famous paper “On Denoting,” Mind 14 (1905).

A7 $t \approx t$, for any term $t$

A8 $x \approx t \supset (\varphi \supset \varphi^x_t)$, so long as $t$ does
not contain, and is not itself, a variable that becomes bound in
$\varphi^x_t$.

A7 captures the point made above, that identity holds between any object and
itself. A8 is nearly as intuitive. The idea is simply that if something is
true of a given object, then it does not matter how the object is referred to;
it is still true of it.13 If, for example, Mark Twain wrote Huckleberry Finn,
then it follows that Samuel Clemens did as well, since they are the same
person. That is, more formally, by A8 it is an axiom that

$m \approx s \supset (WROTE(m, h) \supset WROTE(s, h))$.

If again we add $m \approx s$ as a special axiom, or derive it from other
information we possess, we can then prove by MP that
$WROTE(m, h) \supset WROTE(s, h)$. If we then have in addition the further
information that $WROTE(m, h)$, we can prove by MP once again that
$WROTE(s, h)$.

As a second example, let us revert to our language $\mathcal{L}^*$ once again,
in which we included the identity predicate. In that language, we have both

$a \approx c \supset (Ha \supset Hc)$

and

$a \approx b \supset (Ha \supset Hb)$

as instances of A8. In $M^*$, $a \approx c$ is false, since $\mathcal{G}(a)$ =
Beth, and $\mathcal{G}(c)$ = Charlie. Thus, $a \approx c$ would not be
considered among any special axioms we might have to characterize $M^*$.
Hence, as we should hope, we would not be able to infer $Ha \supset Hc$, which
is also false in $M^*$. However, $a \approx b$ is true in $M^*$ (recall that
we assigned both $a$ and $b$ to Beth as their interpretation), and hence could
be a special axiom for the situation characterized by our structure. By MP we
could then infer from the second of the two instances above that
$Ha \supset Hb$, and from $Ha$ (which might be a further special axiom
perhaps) that $Hb$.

13 There are well known exceptions to this. For example, suppose Shorty is
five feet tall, and that his real name is “Eddie.” So Shorty $\approx$ Eddie.
Nonetheless, from the fact that Shorty is so-called because of his size, it
does not follow that Eddie is so-called because of his size. Other famous
contexts where this principle seems to break down are those involving
psychological attitudes like belief. For example, even though I believe that 9
is prime, I may not, due to my rusty calculus, believe that
$\int_0^3 x^2\,dx$ is prime, despite the fact that
$\int_0^3 x^2\,dx \approx 9$. In the semantics and logic we are constructing
it is assumed that we shall not be needing to formalize expressions like “is
so-called because of” and “believes,” though it should be noted that the
apparatus we have developed here is eminently capable of being extended to
handle such expressions.

As  one  would  hope,  our  logic  remains  sound  and  complete  when  we  add 
the  axioms  for  identity. 

5  Constraint  Languages 

Now  that  we  have  a  well-developed  logical  foundation,  we  will  begin  to  add 
the  particular  elements  that  constitute  a  constraint  language.  In  actuality, 
there  will  be  infinitely  many  possible  constraint  languages,  since  each  set 
of  predicates  specifies  a  different  language.  However,  all  of  them  will  have 
certain  elements  in  common,  and  it  is  these  common  elements  we  want  to 
begin  laying  out  now. 

First,  every  constraint  language  will  be  a  first-order  language  as  described 
above.  Second,  we  will  assume  that  a  constraint  language  will  contain  the 
basic  resources  of  arithmetic — a  distinguished  predicate  NUM,  the  numerals, 
the usual function symbols +, •, and exp, and enough axiomatic power to
prove basic arithmetical facts. The intended semantics for any constraint
language will thus always contain the natural numbers, with these syntactic
items receiving the obvious interpretations. Third, every constraint language
will contain a certain amount of set theory. Throughout this discussion
we have been employing set theory in a rough and ready fashion in our
description of the model theory for first-order languages. In a constraint
language,  we  will  want  to  be  able  to  do  this  in  a  principled  way. 

The full theory of sets that one might find in a textbook is very powerful
and very complex. However, the structures for which we are designing
our constraint languages are all relatively simple; indeed, they are all finite,
though we shall not need to assume this. Furthermore, we will not need
much more than the simplest set theoretic operations and constructions to
express what we want to express. Hence, all we need is enough set theory to
meet these limited needs. We will provide this, along with some motivation
and explication of the relevant concepts, in the next section.

5.1  Basic  Set  Theory 

5.1.1  Membership 

A  set,  intuitively,  is  just  a  collection  of  things  which  themselves  may  or  may 
not  be  sets.  Usually  we  pick  out  a  set  with  the  help  of  some  predicate,  e.g., 
the  set  of  all  prime  numbers,  or  American  citizens,  or  track  and  field  events  in 
the 1988 Olympics. But this is just for our benefit; any collection of things,
even if they cannot be picked out by a common property, indeed even if they
cannot be picked out by us in any way at all (as is the case, e.g., with most
infinite collections of natural numbers), still forms a set. We will see shortly
that  we  have  to  be  a  little  more  careful  than  this  about  the  sets  we  claim  to 
exist;  but  this  at  least  gets  our  intuitions  going  about  what  sorts  of  things 
sets  are. 

The most basic relation a thing can bear to a set is that it can be a
member, or element, of the set. Thus, the number 17 is a member of the set
of all primes; George Bush is a member of the set of American citizens; and
the now unofficial race in which Ben Johnson beat Carl Lewis is a member of
the set of track and field events that took place in the 1988 Olympics. This
special relation is nearly always represented by the symbol $\in$, and as with
all the two-place set-theoretic relations we will introduce, we will use infix
rather than prefix notation. Thus, we will write $a \in b$ rather than $\in ab$.

Logically, sets are just individuals like any others, and so we will use
constants to stand for them. And since not everything is a set, we will
introduce a special predicate SET to abbreviate “is a set.” Since it will
often be convenient to say something general only about sets, we will set
aside the letters r, s, and t (again, perhaps with subscripts and primes) to
serve as special set variables that take only sets as values (and as before
corresponding sans serif characters to serve as metavariables). This way, we
will be able to say things about all and only sets without having to use the
predicate “SET” explicitly. For example, suppose we want to say that the
object a is a member of some set. Without these special set variables we
would have to express this as $\exists x(\mathrm{SET}(x) \wedge a \in x)$. With them, however,
we can simply write this as $\exists s(a \in s)$. Similarly, if we want to express that
every set is a member of some other set, without the set variables we have to
write $\forall x(\mathrm{SET}(x) \supset \exists y(\mathrm{SET}(y) \wedge x \in y))$, whereas with them we can simply
write $\forall r \exists s(r \in s)$. In general, and more abstractly, if s is any set variable
that does not occur in a formula $\varphi$, then $\forall x(\mathrm{SET}(x) \supset \varphi)$ is equivalent to
$\forall s\,\varphi^x_s$ and $\exists x(\mathrm{SET}(x) \wedge \varphi)$ is equivalent to $\exists s\,\varphi^x_s$ (where, once again, $\varphi^x_s$ is the
result of replacing every unbound occurrence of x in $\varphi$ with an occurrence of
s).

It frequently happens that we want to say something $\varphi$ about some or
all the members of a given set s. In our current grammar, this would be
expressed as $\exists x(x \in s \wedge \varphi)$ or $\forall x(x \in s \supset \varphi)$ respectively. For convenience
we allow that these forms can be abbreviated as $(\exists x \in s)\varphi$ and $(\forall x \in s)\varphi$
respectively.
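
For finite sets, the two bounded-quantifier abbreviations can be illustrated with a short sketch. The function names exists_in and forall_in and the sample set are our own illustrative choices, not part of the constraint language; the point is only that the existential form is a conjunction over members and the universal form a conditional that holds vacuously on the empty set.

```python
# Bounded quantifiers over a finite set, mirroring the abbreviations
# (exists x in s) phi  and  (forall x in s) phi.
# All names here are illustrative assumptions.

def exists_in(s, phi):
    """(exists x in s) phi, i.e. exists x (x in s AND phi(x))."""
    return any(phi(x) for x in s)

def forall_in(s, phi):
    """(forall x in s) phi, i.e. forall x (x in s IMPLIES phi(x))."""
    return all(phi(x) for x in s)

def is_odd(x):
    return x % 2 == 1

primes_below_20 = frozenset({2, 3, 5, 7, 11, 13, 17, 19})

some_odd = exists_in(primes_below_20, is_odd)   # some prime below 20 is odd
all_odd = forall_in(primes_below_20, is_odd)    # fails because of 2
vacuous = forall_in(frozenset(), is_odd)        # holds vacuously on the empty set
```
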

5.1.2  Basic  Set  Theoretic  Axioms 

Russell’s Paradox. Sets combine and interact in many interesting ways,
but for deep and historically significant reasons, not in every way that
one might think. For this reason we need to set down clear principles that
tell  us  precisely  when  such  combinations  and  interactions  can  occur,  and 
furthermore  exactly  what  sets  exist  within  a  given  domain.  That  is,  we  need 
some  set  theoretic  axioms. 

In case the reader is not convinced of this need, consider the following
famous paradox, known as Russell’s paradox, after the famous
philosopher/logician Bertrand Russell who discovered it. As noted above, we often
pick out sets in ordinary contexts by means of some predicate or (more
generally) description that holds of all and only the members of the set. Thus,
for example, one might want to consider the set of all Texans over thirty-five
who drink beer by means of the description “Texan over thirty-five who
drinks beer,” or more formally, the description $\mathrm{TEXAN}(x) \wedge \mathrm{age\_of}(x) >
35 \wedge \mathrm{DRINKS\_BEER}(x)$. Let us use the notation $\{x \mid \mathrm{TEXAN}(x) \wedge
\mathrm{age\_of}(x) > 35 \wedge \mathrm{DRINKS\_BEER}(x)\}$ to name this set, and in general the
notation $\{x \mid \varphi\}$ to name the set of things that satisfy the description $\varphi$.
Now, intuitively, one would think that any such description $\varphi$ with a single
unbound variable picks out a corresponding set comprising the things that fit
the description. For after all, a set is just a collection of things; so in
particular the collection satisfying a certain description is a set. Russell found that,
intuitions to the contrary, this is not always so. Consider the description “set
that does not have itself as a member,” i.e., $s \notin s$. (Remember that s is a set
variable.) Intuitively, there are all sorts of sets that satisfy this description:
the set of horses is not a horse and hence is not a member of itself, the set
of solar planets is not a planet, and so on. By the intuitive principle above,
there is a set of all sets that satisfy this description, i.e., there is the set
$r = \{s \mid s \notin s\}$. But now ask yourself: is r a member of itself or not? If it
is, then since r is the set of all sets that are not members of themselves, it
follows that it is not a member of itself after all. If on the other hand it is
not a member of itself, then it satisfies the condition for membership in r,
i.e., it actually is a member of itself. Either way we contradict ourselves. So
there cannot be such a set as r after all, despite what our intuitions tell us.

The Axioms. The lesson here is that not just any collection of things we
can describe is a set. Hence the need for axioms that do not get us into the same sort
of trouble. For our purposes, we need surprisingly few: four axioms and one
axiom schema. The first axiom, extensionality, tells us when two apparent
sets are in fact identical, viz., when they have exactly the same members:

ST1  $\forall r \forall s(\forall x(x \in r \equiv x \in s) \supset r \approx s)$,

i.e., for all sets r and s, if for any object x, x is a member of r if and only if
it is a member of s, then r and s are the same set.

The second axiom, pairing, is that any two objects (within a given
domain) form a set:

ST2  $\forall x \forall y \exists s(s \approx \{x, y\})$,

where “$\{x, y\}$” is a name for the set that contains exactly the objects denoted
by x and y. (By extensionality there can be only one such set.) Thus, to
make this proper, we need to add to our vocabulary the left and right braces
{, }, and to our grammar the rule that if $t_1, \ldots, t_n$ are any terms, then the
expression $\{t_1, \ldots, t_n\}$ is a term as well.$^{14}$

The next axiom declares that the union of any set r exists, i.e., the set
whose elements are exactly the members of the members of r:

ST3  $\forall r \exists s \forall y(y \in s \equiv \exists t(t \in r \wedge y \in t))$,

in English, for any set r there exists a set s such that for any object y, y is
a member of s if and only if there is a set t such that t is a member of r and
the object y is a member of t. For a given set r, we will let $\bigcup r$ stand for the
union of r. ($\bigcup$ is thus a distinguished one-place function symbol, denoting
the (partial) function that takes any set to its union.) We will usually write
$r \cup s$ for $\bigcup\{r, s\}$.

When one set a is a subset of another b (i.e., when all the members of a
are members of b) we express this with a distinguished predicate $\subseteq$ as $a \subseteq b$.
The fourth axiom says that the set of all subsets of any given set exists:

ST4  $\forall r \exists s \forall x(x \in s \equiv x \subseteq r)$,

that is, for any set r there is a set s such that for any object x, x is a member
of s just in case x is a subset of r. If $a \subseteq b$ and $a \not\approx b$, we say that a is a
proper subset of b, and we express this as $a \subset b$. For any given set a, the
set of all its subsets is called the power set of a. The (partial) function that
takes each set to its power set will be denoted by the distinguished function
symbol pow, and thus the power set of a will be denoted by pow(a).
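
As a rough illustration of ST1 through ST4 on finite sets, the following sketch models sets as Python frozensets. This is only an assumption made for the example: frozensets happen to be extensional (equality depends on members alone), and the helper names pair, big_union, is_subset, and powerset are ours rather than symbols of the constraint language.

```python
from itertools import combinations

def pair(x, y):
    """ST2 (pairing): the set {x, y}."""
    return frozenset({x, y})

def big_union(r):
    """ST3 (union): the members of the set-valued members of r."""
    return frozenset(y for t in r if isinstance(t, frozenset) for y in t)

def is_subset(a, b):
    """a is a subset of b: every member of a is a member of b."""
    return all(x in b for x in a)

def powerset(r):
    """ST4 (power set): the set of all subsets of r."""
    items = list(r)
    return frozenset(
        frozenset(c)
        for k in range(len(items) + 1)
        for c in combinations(items, k)
    )

a = frozenset({1, 2})
b = frozenset({2, 3})
union_ab = big_union(pair(a, b))   # plays the role of a ∪ b
subsets_of_a = powerset(a)         # plays the role of pow(a)
```

Here union_ab is computed exactly as the text prescribes, as the union of the pair {a, b}, rather than with Python's built-in union operator.
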

14 Strictly speaking, we can think of ourselves as adding infinitely many new function
symbols $f_1, f_2, \ldots$ to our language, where each $f_n$ is an n-place function symbol, each of
which can by convention be rewritten using the brace notation. The rewritten form of
each $f_n$ is thus evident by the fact that there are n terms between the braces, e.g., {a, b, c}
is the rewritten form of $f_3abc$.

Finally we come to our one set theoretic axiom schema, so-called because
it actually stands for infinitely many axioms of the same general form, one for
each formula of our language. It is called the axiom schema of separation, or
subsets. The idea is quite simple: given a certain set a and some description
$\varphi$ in our language, we can separate out the set of all the members of a that
satisfy the description. Formally, for any formula $\varphi$,

ST5$_\varphi$  $\forall r \exists s \forall x(x \in s \equiv (x \in r \wedge \varphi(x)))$,

where $\varphi(x)$ is the result of replacing any unbound variable in $\varphi$ with x.$^{15}$

Russell’s Paradox Revisited. Given the separation axiom schema we are
able to reintroduce in a restricted form the notation for sets used in the
brief discussion of Russell’s paradox above. The paradox arises when one
assumes one can generate sets arbitrarily with any given formula. Separation
allows one to use arbitrary formulas only to form sets from the members
of previously given sets, and this eliminates the problem; in this light, in
Russell’s argument, for any given set a already proved to exist, one is allowed
to assume only the existence of the set $\{s \mid s \in a \wedge s \notin s\}$, and this causes
no problems at all. Thus, we can safely add the following grammatical rule:$^{16}$
if $\varphi$ is any formula, t any term, and x any variable, then $\{x \mid x \in t \wedge \varphi\}$ is
a term as well. Similar to what we allowed with certain types of quantified
formulas, such terms can also be written as $\{x \in t \mid \varphi\}$.
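
The restricted comprehension $\{x \in t \mid \varphi\}$ can be sketched for finite sets as follows. The function separate and the sample set a are hypothetical names chosen for the example, and sets are again modeled as Python frozensets. Note that a frozenset can never contain itself, so in this model the condition "s is not a member of s" holds of every set; that is a property of the model, not something the sketch proves about set theory.

```python
def separate(t, phi):
    """{x in t | phi}: the members of t that satisfy phi."""
    return frozenset(x for x in t if phi(x))

def not_self_member(s):
    """The Russell condition 's is not a member of s', for set-valued s only."""
    return isinstance(s, frozenset) and s not in s

empty = frozenset()
a = frozenset({empty, frozenset({empty}), "urelement"})

# The harmless, separated version of Russell's set: {s | s in a and s not in s}.
russell_part = separate(a, not_self_member)
```

The result is simply the collection of set-valued members of a, and it is not itself a member of a, so asking whether it belongs to itself leads to no contradiction.
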

15 Assuming of course x does not become bound in the process; if it does, we can always
replace it in the above schema with a new variable not occurring in $\varphi$.

16 Or more cautiously, it appears that we can do so safely for all we can tell. Due to
Gödel’s famous second incompleteness theorem, there is no way to prove that there are
not other hitherto undiscovered paradoxes lurking in the theory of sets; that is, we cannot
prove its consistency (at least, not without begging the question by proving it in a theory
that is at least as dubious). The great success of the theory over the past eighty-five years,
however, and the absence of any new paradoxes despite extensive use and scrutiny of the
theory, has given logicians great confidence that it is in fact consistent, even if we shall
never know this with utter certainty.

5.1.3 Finitude and the Set of Natural Numbers

As noted, we are assuming the existence of the natural numbers. It will prove
very useful then to assume in addition that they jointly form a set; this is
not provable from the above axioms. The easiest way to do this is just to
add an axiom that declares this explicitly:

NN  $\exists s \forall x(x \in s \equiv \mathrm{NUM}(x))$,

i.e., there exists a set s such that for any object x, x is an element of s if and
only if x is a natural number. By the axiom of extensionality, there can be
only one such set. We will call it $\mathcal{N}$.

We are now able to define another useful notion. As noted, the structures
we will examine will be finite. Nonetheless, it will still be important to be
able to say explicitly that they are finite, and hence we need to be able to
express the concept of finitude. We can do this with the help of the set $\mathcal{N}$.
Specifically,

Def 5:  $\mathrm{FINITE}(s) \equiv_{df} (\exists n \in \mathcal{N})(s \sim \{m \in \mathcal{N} \mid m < n\})$,

where $t \sim r$ means intuitively that t and r are the same size, i.e., that there
is a one-to-one correspondence between them. (This latter notion can also be
defined straightforwardly with the set theoretic apparatus at our disposal.)
Thus, a set is finite just in case it is the same size as the set that contains
all and only the natural numbers less than a given natural number n. The
number n is said to be the cardinality of the set.
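
To make Def 5 concrete, the sketch below exhibits, for a finite set, a number n together with a one-to-one correspondence onto the initial segment of natural numbers below n. The names initial_segment, is_bijection, and witness_finite are assumptions of ours, and the correspondence is stored as a Python dict purely for convenience.

```python
def initial_segment(n):
    """The set {m in N | m < n}."""
    return frozenset(range(n))

def is_bijection(f, domain, codomain):
    """f (a dict) is a one-to-one correspondence from domain onto codomain."""
    return (
        set(f.keys()) == set(domain)
        and set(f.values()) == set(codomain)
        and len(set(f.values())) == len(f)
    )

def witness_finite(s):
    """Return (n, f): the cardinality n of s and a bijection f from s
    onto {m | m < n}, obtained by numbering the members of s."""
    f = {x: i for i, x in enumerate(s)}
    return len(f), f

events = frozenset({"100m", "long jump", "marathon"})
n, f = witness_finite(events)
```
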

5.1.4  Difference,  Intersection,  and  the  Empty  Set 

Many interesting and important facts about sets are derivable from the above
axioms. We will state two. The first is the existence of the difference
$a - b$ of two sets a and b, i.e., the set of elements of a that are not in b. ($-$ is
thus a new two-place function symbol.) It is easy to prove that $a - b$ exists:
by union, $a \cup b$ exists, and by separation, there is an s that contains just
those elements of $a \cup b$ that are both in a and not in b.

The next thing we will prove is the existence of the intersection of any two sets,
where the intersection of sets a and b is just the set of all objects that a and
b both have as members. We will refer to this set as $a \cap b$, making use of the
distinguished two-place function symbol $\cap$. The proof that $a \cap b$ exists is also
easy: by union, $\bigcup\{a, b\}$ exists; by separation, we then pull out the set of all
$x \in \bigcup\{a, b\}$ such that both $x \in a$ and $x \in b$. In general, we can show that
the intersection of any number of sets exists in essentially the same way.

Notice that often there might be no elements common to two sets.
Nonetheless, their intersection is a perfectly good set: the empty set. We can
prove the existence of the empty set a bit more formally like this. We know
there are sets, since first-order logic guarantees the existence of at least one
object a, and by pairing it follows that the singleton set {a} exists. By the
schema of separation, letting $\varphi$ be the formula $x \not\approx x$ (i.e., $\neg(x \approx x)$), there
is a set s that contains all the members x of {a} such that $x \not\approx x$, i.e., all the
members of {a} that are not identical to themselves. But of course there are
no members of {a} that fit that description. So s is a set with no members,
i.e., the empty set. Following the usual practice, we will use the constant $\emptyset$
to refer to this set. Two sets r and s are said to be disjoint if they have no
members in common, i.e., if $r \cap s \approx \emptyset$. A set s of sets is said to be pairwise
disjoint if any two distinct members of s are disjoint.
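
The derivations in this subsection can be retraced on finite sets. In the sketch below, which again models sets as frozensets, difference, intersection, and the empty set are each obtained only from pairing, union, and separation, following the proofs in the text. All function names are illustrative assumptions.

```python
def pair(x, y):
    """Pairing: the set {x, y}."""
    return frozenset({x, y})

def big_union(r):
    """Union: the members of the members of r (r is a set of sets here)."""
    return frozenset(y for t in r for y in t)

def separate(t, phi):
    """Separation: {x in t | phi}."""
    return frozenset(x for x in t if phi(x))

def difference(a, b):
    """a - b, separated out of the union of {a, b}."""
    return separate(big_union(pair(a, b)), lambda x: x in a and x not in b)

def intersection(a, b):
    """a ∩ b, separated out of the union of {a, b}."""
    return separate(big_union(pair(a, b)), lambda x: x in a and x in b)

def empty_set(a):
    """The members of the singleton {a} that are not identical to themselves."""
    return separate(pair(a, a), lambda x: x != x)

def disjoint(r, s):
    return intersection(r, s) == frozenset()

def pairwise_disjoint(s):
    """Any two distinct members of s are disjoint."""
    return all(disjoint(r, t) for r in s for t in s if r != t)
```
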

5.1.5  Functions  and  Ordered  n-tuples 

This set theoretic apparatus enables us to provide an elegant account of
certain other important notions. First, an extremely versatile and useful
notion is that of an ordered pair. An ordered pair is similar to a set of two
elements, except that unlike a set, which is an unordered collection, there
is a first member and a second member. Thus, where $\langle a, b \rangle$ stands for the
ordered pair whose first element is a and whose second element is b, what is
important about ordered pairs is that they satisfy the following principle:

OP  $\forall x \forall y \forall z \forall w(\langle x, y \rangle \approx \langle z, w \rangle \supset (x \approx z \wedge y \approx w))$.

That is, ordered pairs are identical only if their first elements are identical
and their second elements are identical, i.e., only if, like any set, they have
the same elements, and, unlike sets (which have no further structure beyond
their elements), those elements occur in the same order. The way we write
down names for the members of an ordered pair, unlike sets, is therefore
significant, since the first name we write down signifies the first element of
the pair, and the second name the second element. For example, whereas
$\{a, b\} \approx \{b, a\}$, we have in the case of ordered pairs that $\langle a, b \rangle \not\approx \langle b, a \rangle$
(when a and b are distinct).

As it happens, we need not introduce ordered pairs as a new sort of object,
since with a little set theory it is easy to define them as sets of a certain sort.
There are many ways to do this, but given that we will have numbers in the
semantics for all constraint languages, for our purposes the easiest way to pull
this off is simply by “marking” the intended first element of an ordered pair
with the number one, and the second with the number two. More precisely,
we define the ordered pair $\langle a, b \rangle$ just to be the set $\{\{a, 1\}, \{b, 2\}\}$. It is easy
to check that ordered pairs so defined satisfy the above principle. More
generally, we can define the notion of an ordered n-tuple in the same way:
the n-tuple $\langle a_1, \ldots, a_n \rangle$ is defined to be the set $\{\{a_1, 1\}, \ldots, \{a_n, n\}\}$.
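
The claim that pairs defined as $\{\{a, 1\}, \{b, 2\}\}$ satisfy the principle OP can be spot-checked by brute force over a small domain. The sketch below is only such a check, not a proof; the names opair and satisfies_op and the choice of domain (which deliberately includes the marker numbers 1 and 2) are our own assumptions.

```python
from itertools import product

def opair(a, b):
    """The ordered pair <a, b>, defined as the set {{a, 1}, {b, 2}}."""
    return frozenset({frozenset({a, 1}), frozenset({b, 2})})

def satisfies_op(domain):
    """Brute-force check of OP: whenever <x, y> equals <z, w>,
    x equals z and y equals w."""
    return all(
        x == z and y == w
        for x, y, z, w in product(domain, repeat=4)
        if opair(x, y) == opair(z, w)
    )

domain = ["a", "b", 1, 2]
op_holds = satisfies_op(domain)
```

The sketch covers pairs only; it says nothing about whether the analogous definition of n-tuples behaves as well when the components are themselves numbers.
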

Given the notion of an ordered n-tuple, we can give a more precise account
of the notion of a function. A one-place function f from one set r to another
s is just a mapping that takes each element a of r (or some subset of r, if
f is partial) to an element $b = f(a)$ of s. Thus, we can simply think of
such a function as a set of ordered pairs $\langle a, b \rangle$ where b is the element that
a is mapped to by the function f. More generally, an n-place function is a
set of ordered (n + 1)-tuples $\langle a_1, \ldots, a_n, a_{n+1} \rangle$ where $a_{n+1}$ is the object that
$a_1, \ldots, a_n$ are mapped to by the function. Functions thus turn out simply to
be a type of set. The set of all one-place functions from one set r to another s
will be denoted by ${}^{r}s$.
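
A finite illustration of functions as sets of ordered pairs follows. For readability the sketch uses Python's built-in tuples in place of the set-theoretic encoding of pairs, which is an assumption of the example, as are the names is_function, apply_function, and all_functions. It enumerates every total one-place function from r to s; for finite sets there are as many of these as the size of s raised to the size of r.

```python
from itertools import product

def is_function(f, r, s):
    """f is a set of pairs (a, b) with a in r and b in s, and each
    element of r occurs as a first component exactly once (f is total)."""
    return (
        all(a in r and b in s for a, b in f)
        and all(sum(1 for a, _ in f if a == x) == 1 for x in r)
    )

def apply_function(f, a):
    """The unique b such that (a, b) is in f."""
    (b,) = [v for k, v in f if k == a]
    return b

def all_functions(r, s):
    """The set of all total one-place functions from r to s."""
    r_list = list(r)
    return frozenset(
        frozenset(zip(r_list, values))
        for values in product(list(s), repeat=len(r_list))
    )

r = frozenset({"x", "y"})
s = frozenset({0, 1, 2})
funcs = all_functions(r, s)
```
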

5.1.6  The  Intended  Semantics:  The  Cumulative  Hierarchy  of  Sets 

The above gives a good idea of how sets combine and interact, and what sets
we can suppose there to be, but it does not provide much of an idea of the
intended semantics for set theory and hence for constraint languages generally.
The intended picture of the structure of sets within a given domain is known
as the iterative, or cumulative, conception of set. On this conception, sets are
hierarchical; they come in levels. The lowest level $L_0$ consists of our initial
set of urelements, i.e., things that are not themselves sets: numbers, people,
machines, buildings, strings, database records, countries, etc. The next level
$L_1$ consists of all possible subsets of $L_0$ together with the urelements, i.e.,
$L_1 = pow(L_0) \cup L_0$. The next level $L_2$ consists of all possible subsets of $L_1$
together with all the elements of $L_1$. In general, $L_{n+1} = pow(L_n) \cup L_n$. Each
level is cumulative, i.e., it pulls up the elements of the previous level to join
all the sets that could be formed out of those elements. And so it continues
through the sequence of natural numbers. The intended semantics for a
given constraint language, sets and all, is just the union of all these levels,
i.e., $\bigcup_{i \in \mathcal{N}} L_i$.$^{17}$
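
The first few levels of the hierarchy can be computed directly for a small stock of urelements. In the sketch below the two urelements are represented by strings and sets by frozensets; these choices, and the names powerset, next_level, and levels, are assumptions made for illustration. Because the levels grow exponentially, only very small depths are practical.

```python
from itertools import combinations

def powerset(r):
    """The set of all subsets of r."""
    items = list(r)
    return frozenset(
        frozenset(c)
        for k in range(len(items) + 1)
        for c in combinations(items, k)
    )

def next_level(level):
    """L_{n+1} = pow(L_n) ∪ L_n."""
    return powerset(level) | level

def levels(urelements, depth):
    """Return the list [L_0, ..., L_depth]."""
    result = [frozenset(urelements)]
    for _ in range(depth):
        result.append(next_level(result[-1]))
    return result

L0, L1, L2 = levels({"beth", "charlie"}, 2)
sizes = [len(L0), len(L1), len(L2)]
```

With two urelements, $L_1$ has the four subsets of $L_0$ plus the two urelements, and $L_2$ has the 64 subsets of $L_1$ plus the two urelements (the four sets in $L_1$ already occur among those subsets).
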

5.2  Constraints  Revisited 

With the above apparatus in place, we can return to the notion of a constraint
and offer an account that is a little more precise. It is our contention that
any current information modeling language, and almost any language likely to
appear on the scene, can be translated into a subset of our language. There is
nothing  particularly  controversial  about  this  claim,  given  the  logical  strength 
of  the  language  we  have  introduced.  The  only  way  to  strengthen  it  in  any 
significant  way  would  be  to  move  to  a  full  higher-order  language  and  logic;  but 
few  if  any  concepts  that  need  to  be  expressed  in  the  domain  of  information 
modeling,  database  modeling,  and  the  like  need  anything  approaching  the 
power  of  higher-order  logic.  Thus,  our  full-strength  first-order  language  cum 
logic  cum  set  theory  should  be  all  we  need  to  express  anything  that  can  be 
expressed  in  any  extant  or  likely  modeling  language. 

The theory here is also expressive enough to define the intended
semantic structures that interpret these modeling languages, and expressive
enough to define the model theoretic connections (i.e., the interpretation
functions and variable assignments) between the languages and those
structures. Thus, we will be able to define the notion of truth for formulas of the
language (or for functionally similar syntactic expressions; let us call them
assertions), and hence we will be able to characterize when a given semantic
structure is a realization (in the sense of Section 3.2) of a given set of
formulas or, more generally, assertions. We sketch an example of this in the next
section.

17 Though this is not anything we can say in the formal constraint language itself, since
we can only use it to talk about things within its semantic domain; failure to realize this
ever-present semantic limitation is in fact what lies behind Russell’s paradox.

Using these facts, we can flesh out the notion of a constraint more
precisely. Let us call a set of formulas or assertions in a given modeling language
a diagram. As noted in the introduction, a modeling language might be put
to two very different uses: a descriptive, or de facto, use, and a prescriptive,
or de jure, use. Suppose a modeling language ML is being used with respect
to a given system S, and the modeler develops a specific diagram D. If ML
is being used descriptively, then the system S as it is should be capable of
being understood as a realization of D. That is, if the diagram D is a correct
description of S, it should be possible to consider S abstractly (at the time in
question) as a particular instance of an intended semantic structure for ML
that makes all the assertions in D true.

On  the  other  hand,  if  ML  is  being  used  prescriptively,  then  it  will  not 
necessarily  be  possible  to  consider  S  as  it  is  to  be  a  realization  of  D.  This 
will  typically  be  the  case  for  the  prescriptive  use  of  ML,  since  the  function 
of  a  diagram  in  such  uses  is  to  improve  or  alter  the  existing  structure  of  the 
system  in  question.  The  system  will  fail  to  realize  the  diagram.  In  such  a 
case,  the  assertions  of  D  must  then  be  considered  not  as  descriptions  of  S,  but 
as  constraints  on  S;  they  are  assertions  that  must  be  satisfied  by  any  state  of  S 
that  is  to  be  deemed  acceptable.  The  diagram,  that  is  to  say,  is  prescriptive 
rather  than  descriptive.  The  realizations  of  D  within  the  intended  model 
theory  of  ML  can  thus  be  thought  of  as  abstract  characterizations  of  the 
acceptable  states  of  S,  the  sorts  of  states  that  S  is  permitted  to  be  in. 

In  both  cases,  then,  de  facto  and  de  jure,  D  has  realizations  (so  long  as  it 
is  not  contradictory).  Only  in  the  former  case  is  it  assumed  that  the  current 
state  of  the  system  under  scrutiny  can  itself  be  considered  a  realization  of  D. 
In the latter, D will in general only have abstract (i.e., set theoretic)
realizations which represent the acceptable states of the system. Given this, then,
a  constraint  can  be  defined  simply  to  be  an  assertion  within  a  prescriptive 
diagram. 
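
The distinction between the two uses of a diagram can be sketched in code under some simplifying assumptions of our own: a system state is represented as a Python dict, an assertion as a Boolean-valued function of a state, and a diagram as a list of such assertions. The function realizes and the employee and department data are hypothetical and do not come from any particular modeling language.

```python
def realizes(state, diagram):
    """A state realizes a diagram if it makes every assertion in it true."""
    return all(assertion(state) for assertion in diagram)

# A hypothetical diagram D with two assertions.
diagram = [
    # Every department that someone works in is a known department.
    lambda s: all(d in s["departments"] for d in s["works_in"].values()),
    # Every employee is assigned to some department.
    lambda s: set(s["works_in"]) == set(s["employees"]),
]

current_state = {
    "employees": {"e1", "e2"},
    "departments": {"d1"},
    "works_in": {"e1": "d1"},                  # e2 has no department yet
}
acceptable_state = {
    "employees": {"e1", "e2"},
    "departments": {"d1"},
    "works_in": {"e1": "d1", "e2": "d1"},
}

# Read descriptively, D is simply false of the current state.
descriptive_ok = realizes(current_state, diagram)
# Read prescriptively, D's assertions are constraints that pick out
# the acceptable states, of which acceptable_state is one.
prescriptive_target = realizes(acceptable_state, diagram)
```
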

5.3  Information  Structures:  An  Intuitive  Account 

Now that we have all this apparatus at our disposal, it should be put to good
use. We will demonstrate the power of the apparatus as well as some of the
ideas and claims mentioned above by using a constraint language to define a
general type of set theoretic structure suggested by the information modeling
technique IDEF1. (An overview of IDEF1 is found in Appendix A.) These
structures are similar to the entity-relationship-attribute structures defined
by Chen in his seminal 1976 paper [5], though we make explicit the element of
intensionality in such structures (see below). Despite their relative simplicity,
we have found these structures to be very powerful and flexible mathematical
tools for characterizing many different types of information-bearing systems.
Consequently,  for  purposes  here  we  will  call  them  information  structures.  In 
this  section  we  will  develop  an  informal  picture  of  these  structures  using  our 
apparatus.  A  more  formal  treatment  is  found  in  Appendix  B. 

An information structure consists of four different types of objects: entity
classes, attribute value classes, attributes, and links. Entity classes,
attributes, and links are thought of as intensional entities, in the sense that,
unlike  sets,  they  can  have  different  members,  or  better,  instances,  across 
time.  Intuitively,  the  instances  of  entity  classes  at  any  given  time  are  best 
thought  of  as  featureless  “pegs”  on  which  we  hang  clusters  of  information. 
A  good  model  for  an  instance  of  an  entity  class  might  be  an  internal  pointer 
within  a  computer’s  memory  (the  featureless  entity  itself)  that  points  to  a 
collection  of  records  on  disk  (the  clusters  of  information)  associated  with, 
say,  a  given  employee  in  a  company.  Since  we  may  keep  several  different 
clusters  of  information  on  a  single  real-world  individual — for  instance,  the 
records  on  that  individual  in  the  role  of  an  employee,  and  the  records  on 
that  same  individual  in  the  role  of  a  secretary — we  think  of  all  the  entity 
classes  as  disjoint. 

Attributes are (intensional) functions from entity class instances to
attribute values. Intuitively, an attribute (SALARY_OF, for example) takes
an instance e of an entity class (the class of employees, say) to the value
of that attribute applied to e, viz., in this case the salary of the individual
represented by e.

In the definition of an information structure one associates with each
entity class a set (possibly empty) of attributes designated to be the ones
owned by that entity class. For example, the department entity class might
own the attribute DEPT-NUM-OF, the employee entity class the attributes
EMPLOYEE-NUM-OF and WORKS-IN, and the secretaries entity class
might own the attribute TYPING-SPEED-OF.

Links are functions from entity class instances to entity class instances.
That is, a link associates each instance of a given entity class with an
instance of another (possibly the same) entity class. Thus, for example, the
link WORKS-IN maps each instance e of the employees entity class to the
department instance that e works for. Links come in three flavors:
one-to-one, strong many-to-one, and weak many-to-one. To illustrate these,
suppose that E and E' are entity classes in an information structure, and
that l is a link from E to E'. Then l is one-to-one if no two distinct
instances of E can possibly be mapped by l to the same instance of E'. The
link l is strong many-to-one if it is not one-to-one and, necessarily, every
instance of E' has at least one instance of E mapped to it by l. And l is
weak many-to-one if it is of neither of the above two kinds. Note that if l
is neither one-to-one nor strong many-to-one, then every instance of E'
always has zero or more instances of E mapped to it by l.
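
The three flavors can be illustrated with a small Python sketch. It is only an approximation: the definitions above are modal (they speak of what can possibly or must necessarily hold), whereas the code inspects a single snapshot of the data. The function name classify_link and the sample instance names are hypothetical.

```python
from typing import Dict, Hashable, Set


def classify_link(mapping: Dict[Hashable, Hashable], targets: Set[Hashable]) -> str:
    """Classify one snapshot of a link l from E to E'.

    mapping: each instance of E -> the instance of E' it is mapped to.
    targets: all instances of E' present in the snapshot.
    """
    images = list(mapping.values())
    if len(set(images)) == len(images):
        return "one-to-one"          # no two instances of E share an image
    if targets <= set(images):
        return "strong many-to-one"  # every instance of E' is hit at least once
    return "weak many-to-one"        # some instance of E' has nothing mapped to it


# Assumed sample data: employees mapped to departments by WORKS-IN.
works_in = {"alice": "sales", "bob": "sales", "carol": "research"}
print(classify_link(works_in, {"sales", "research"}))
print(classify_link(works_in, {"sales", "research", "legal"}))
```

A snapshot that happens to look one-to-one does not show that the link is one-to-one in every possible instantiation, so a check of this kind can refute a declared flavor but cannot establish it.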

Since links are functions, they can often be composed to forge new links
between entity classes. Suppose we have a one-to-one link WORKS-FOR
between the secretary entity class and the employee entity class to
indicate the link between (the cluster of information we keep on) secretaries
and (the cluster of information we keep on) the employees they work for.
Then by composing this link with the link WORKS-IN, we have a new link
WORKS-IN • WORKS-FOR from secretary to department, viz., the link
that maps the information about a given secretary to the department his or
her boss works for.

Since attributes are also functions, we can compose them with links to
generate new attributes. For example, if we compose the link WORKS-IN
with the attribute DEPT-NUM-OF that is owned by the entity class
department, we have a new attribute DEPT-NUM-OF • WORKS-IN that
maps each employee to the department number of the department he or she
works for. The new attribute DEPT-NUM-OF • WORKS-IN now associated
with employee is said to be an inherited attribute in employee, and
we say that employee inherits the owned attribute DEPT-NUM-OF from
the entity class department down the link WORKS-IN. Finally, we say that
the inherited attribute DEPT-NUM-OF • WORKS-IN is derived from the
attribute DEPT-NUM-OF.
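
Both kinds of composition can be sketched in Python, assuming each link or attribute is recorded as a dictionary for one snapshot of the data. The helper name compose and the instance identifiers (sec_1, emp_7, dept_a, and so on) are hypothetical.

```python
from typing import Dict, Hashable


def compose(outer: Dict[Hashable, object], inner: Dict[Hashable, Hashable]) -> Dict[Hashable, object]:
    """Return outer • inner: apply inner first, then outer."""
    return {x: outer[y] for x, y in inner.items()}


works_for = {"sec_1": "emp_7", "sec_2": "emp_9"}    # link: secretary -> employee
works_in = {"emp_7": "dept_a", "emp_9": "dept_b"}   # link: employee  -> department
dept_num_of = {"dept_a": 10, "dept_b": 20}          # attribute owned by department

# New link from secretary to department: WORKS-IN • WORKS-FOR
secretary_dept = compose(works_in, works_for)

# Inherited attribute in employee: DEPT-NUM-OF • WORKS-IN
inherited_dept_num = compose(dept_num_of, works_in)

print(secretary_dept)
print(inherited_dept_num)
```

Note the order of the arguments: as in the notation WORKS-IN • WORKS-FOR, the function written on the right is applied first.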

Certain collections of the attributes (both owned and inherited) associated
with a given entity class are always able to distinguish every member
of the class from every other. A collection of attributes that does so in every
possible instantiation of the class and which does not contain any unnecessary
attributes for that purpose is called a key class. Suppose employees in
different departments can have the same employee number, but employees
in the same department cannot. Then the class consisting of the attributes
DEPT-NUM-OF • WORKS-IN and EMPLOYEE-NUM-OF constitutes a
key class for the employee entity class. If we were to add SALARY-OF
to this class, it would still perform the same individuating function, but it
would not be a key class, since the added attribute is unnecessary to this
function.
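
The two requirements on a key class (it individuates, and it contains nothing unnecessary) can be checked on sample data with the following Python sketch. The check covers one snapshot only, whereas the definition quantifies over every possible instantiation; the attribute names dept_num, emp_num, and salary and the function names are illustrative assumptions.

```python
from itertools import combinations
from typing import Dict, List, Sequence

Row = Dict[str, object]


def individuates(rows: List[Row], attrs: Sequence[str]) -> bool:
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_key_class(rows: List[Row], attrs: Sequence[str]) -> bool:
    """True if attrs individuates the rows and no proper subset of attrs does."""
    if not individuates(rows, attrs):
        return False
    return not any(
        individuates(rows, subset)
        for size in range(len(attrs))
        for subset in combinations(attrs, size)
    )


# Employee numbers repeat across departments but not within one department.
employees = [
    {"dept_num": 10, "emp_num": 1, "salary": 50000},
    {"dept_num": 10, "emp_num": 2, "salary": 50000},
    {"dept_num": 20, "emp_num": 1, "salary": 50000},
]

print(is_key_class(employees, ("dept_num", "emp_num")))
print(is_key_class(employees, ("dept_num", "emp_num", "salary")))
```

The second call fails the minimality test even though the three attributes together still individuate every row, which mirrors the SALARY-OF remark above.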

Since the information we keep about objects, represented in their
attribute values, is usually the only way to distinguish them, every entity class
must have at least one associated key class. In addition, the following
conditions are required: (i) if l links the entity classes E and E', then E inherits
from E' all the attributes in some key class of E' down l; and (ii) if l is a
one-to-one link from E to E', then the inherited attributes of E that are derived
from the attributes of E' that E inherits from E' down l themselves form a
key class of E. The idea behind (i) is this: suppose that entity class E is
linked to E', and that instance e is mapped to instance e' by this link. Then
all the information associated with e' becomes thereby associated with e in
virtue of the link between them. The idea behind (ii) is that, if in addition
the link is one-to-one, so that no other instance of E besides e is linked to
e', then the information in any key class of E' that distinguishes e' from all
other possible instances of E' also must distinguish e from all other possible
instances of E.

It  will  be  useful  to  the  reader  at  this  point  to  see  how  these  informal  ideas 
are  explicated  formally  in  the  formal  framework  in  the  appendix. 


6  Summary 

The theory we have developed in this paper has several purposes. First, it
provides a language for model specification. That is, the theory can be used
to provide rigorous definitions of the syntax of a modeling methodology, so
that it is wholly clear exactly what constructs are permissible in the
methodology and what are not, and a precise account of its semantics, so that
modelers have a clear vision of the sorts of structures they are to be identifying
and modeling with the methodology in question.

Second,  the  theory  provides  a  broad  and  expressively  powerful  language 
that  can  be  used  to  supplement  any  given  methodology  by  enabling  it  to 
describe  and  express  constraints  otherwise  inexpressible  in  the  methodology
proper.  We  saw  examples  of  this  above.  This  function  of  the  theory  can  also 
be  useful  in  the  design  or  modification  phase  of  a  given  methodology,  in  that 
it  can  point  out  clearly  the  logical  form  of  the  sorts  of  information  that  one 
wishes  to  capture  within  the  methodology. 

Finally, the theory is powerful enough to capture the information content
of any model within any existing methodology (IDEF1, IDEF1X, ENALIM,18
ER,19 etc.) and also, we believe, any likely model as well. It thus serves as
a foundation for the construction of a Neutral Information Representation
Scheme which has the capability of capturing information from a model
developed using one type of methodology and transferring it, as faithfully as
possible, to a model constructed from another type of methodology. We
are now at the point where we can begin thinking directly about the sorts of
algorithms and heuristics that will be needed to carry out such a task. The
framework here provides the necessary medium.


18 I.e., Enhanced Natural Language Information Modeling Method.
19 I.e., Entity-Relationship modeling method.

A  An Overview of IDEF1


Before attempting any of the other chapters in this report, the reader must
understand IDEF1, one of the Integrated Computer Aided Manufacturing
(ICAM) DEFinition (IDEF) languages. IDEF1 has a simple and clean syntax
which can be understood quickly. On the other hand, there is an art to
modeling in any methodology. IDEF1’s design makes it imperative that the
modeler understand proper modeling discipline.

As in each of the following chapters, this chapter will begin with a
discussion of IDEF1’s history and purpose and then move on to its syntax and
semantics. Those familiar with the methodologies may not need to read the
syntax and semantics sections, but keep in mind that many methodologies
have several dialects. In order to understand the metamodels, it is important
that the reader understand which dialect is being modeled. In general, the
original definitions of methodologies are strictly adhered to.

A.1  History  and  Purpose

The family of IDEF methodologies is meant to provide methods and
languages for discovery, representation, and consensus development of the views
of an enterprise necessary to allow for planning and design of integrated
information systems. That is, the IDEF methodologies were specifically developed
for supporting the domain experts and systems analysts in gathering
information about the existing environment and achieving consensus within the
environment relative to those descriptions. IDEF0 was developed to model
the decisions, actions, and activities within a domain and the relationships
among those activities. IDEF1 provides the methods for discovery and
representation of the logical structure and relations between basic information
groups actually managed by an organization. IDEF2 provides a method for
development of quantitative simulation models that allow the study of the
time-varying behavior of a system that is stochastic in nature. IDEF3 supports
the direct capture of domain experts’ descriptions of process flow and
object-state transitions. IDEF5 is under development to support the capture and
representation of domain knowledge, concepts, and terminology (sometimes
referred to as domain ontologies). IDEF1X was the first IDEF methodology
to focus on support of system design activities. IDEF1X incorporates
criteria for efficient conceptual schema design. IDEF4 was developed later
to support the design of object-oriented systems, particularly systems
encompassing the use of object-oriented databases. As a family, the IDEF
methodologies provide the modeler with the ability to concentrate on views
of an enterprise without using a sledgehammer methodology meant to model
all views.

IDEF1 models the information managed within a system. Though closely
related to IDEF1X, it is not a subset of IDEF1X. IDEF1 and IDEF1X are
similar, but by providing a methodology for data modeling and consequently
conceptual schema database design, the developers of IDEF1X added
constructs which cloud the distinction between data which is kept about objects
and the objects themselves. This was necessary since a conceptual schema
by definition is a type of data dictionary (albeit a complex on-line
dictionary used to provide both access and control to distributed electronic
heterogeneous databases). Thus, a conceptual schema designer must develop a
structure that can contain both the data objects and the information about
those data objects (such as their physical system location). IDEF1, however,
was designed to be both more general and less committed to any
particular implementation concept. In a properly developed IDEF1 model there
should never be any misconception: only the information kept within an
organization about objects (physical, abstract or data) is being modeled.

IDEF1 entities need not correspond directly to any particular object in
the real world; the IDEF1 model represents the modeler’s analysis results.
The analysis method results in a reconstruction of the underlying structure
and grouping of the information actually managed. In the real world these
logical groups of attributes may be distributed over many data artifacts.
Also, since data can be kept by the organization about any object (physical,
abstract or data), this flexibility is necessary when attempting to establish
information requirements. However, it is not constraining enough when doing
database design (hence the need for IDEF1X, IDEF4, Entity Relationship
(ER) and other design methods).

As with any of the IDEF methodologies, IDEF1 has primarily been used
by defense contractors under contract to the Air Force. Hughes has a
proprietary version of IDEF1 called ELKA (Entity Link Key Attribute). IDEF1’s
connection with defense projects is good in that a strong underlying analysis
method has been developed for the application of IDEF1 modeling. With the
growing recognition of the need for a system development framework
of methods and the availability of low-cost integrated tools for IDEF1
application, we can expect to see IDEF1 gain more widespread usage.

A.2  Syntax  and  Informal  Semantics 

A.2.1  Basic  Syntax

The lexicon of the IDEF1 language syntax consists of just four basic symbols
(see Figure 1):

•  Labeled  boxes  denoting  entity  classes, 

•  Labeled  lines  with  five  different  types  of  diamond-shaped  terminators
denoting  relation  classes,

•  Labels  inside  the  boxes  denoting  attribute  classes, 

•  Parenthesized  (or  underlined)  sets  of  labels  denoting  key  classes. 


A.2.2  Entity  Class,  Attribute  Class,  and  Key  Class

The concept of an entity class is meant to capture the notion of a basic
information structure, the extension of which at any point in time is a set of
informational items called entities. The basic ideas behind the notion of an
entity are that:

•  they  are  persistent  (i.e.  the  organization  expends  the  resources  (time, 
money,  equipment  or  facilities)  to  observe,  encode,  record,  organize  and 
store  the  existence  of  individual  entities), 

•  they  can  be  individuated  (i.e.  they  can  be  identified  uniquely  from 
other  entities).

Figure 1: IDEF1 Graphical Lexicon (callouts: symbol denoting an entity
class; entity class label; symbols denoting attribute classes).

Figure 2: Card file interpretation of an IDEF1 entity class.

The IDEF1 language does not provide a means of representing the
individual entities, only groups of entities which share exactly the same types of
attributes. These groups, from an IDEF1 view, are called classes. A useful
memory aid for this notion is to think of the entity class as a layout for a
card file (see Figure 2). An entity class has a name and a unique
identification number associated with it, along with a glossary entry and a list
of synonyms. An entity class is represented by a rectangular box with the
label of the entity class located in the lower left corner of the box,
surrounded by a smaller rectangle, and with the entity class number located
in the lower right corner of the larger box.

An entity class is actually defined by the set of attribute classes that
define the characteristics of all the possible entities in all of its extensions.
It is important to note that the set of attributes is more important than the
notion conveyed by the entity class label! In other words, one
can think of the entity class as simply a labeled bucket with no meaning
beyond that of the collection of attribute classes it contains (see Figure 3).
In fact, it is considered good practice to use an entity class label that does
not name a physical or data object in the domain since that could confuse an
uninformed reader. The labels of the attribute classes that define an entity
class are simply listed in the entity class box below the key class designators
and above the entity class label.

Figure 3: Bucket analogy

The occurrence of the same attribute class in multiple entity class
definitions defines a relationship between those entity classes. In order to establish
the existence dependency between such entity classes, one entity class must
be determined to be the owner of the shared attribute class. Every attribute
class that ends up being a part of an IDEF1 model has exactly one owner
entity class. When deciding on the addition of an attribute class to an entity
class, two rules must be followed. The first is referred to as the No-Null Rule.
This rule states that no member of an entity class can take a null value for
its attribute that corresponds to the added attribute class (Figure 4).

Figure 4: Example of the No-Null rule.

The second rule, the No-Repeat Rule, states that no member of an entity
class can take more than one value at a time for its attribute that corresponds
to the added attribute class (Figure 5).
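
The two rules can be pictured as a validation pass over the members of an entity class. The Python sketch below rests on assumptions that are not part of IDEF1: members are dictionaries, a missing or None entry (or an empty collection) stands for a null, and a collection with more than one element stands for a repeated value. The function name rule_violations and the phone attribute are hypothetical.

```python
from typing import Dict, List, Tuple


def rule_violations(members: List[Dict[str, object]], attribute: str) -> List[Tuple[int, str]]:
    """Return (member index, rule name) for each violation of either rule."""
    problems = []
    for index, member in enumerate(members):
        value = member.get(attribute)
        is_collection = isinstance(value, (list, tuple, set, frozenset))
        if value is None or (is_collection and len(value) == 0):
            problems.append((index, "No-Null"))    # no value at all
        elif is_collection and len(value) > 1:
            problems.append((index, "No-Repeat"))  # more than one value at a time
    return problems


members = [
    {"phone": "555-0100"},
    {"phone": None},
    {"phone": ["555-0101", "555-0102"]},
    {},
]
print(rule_violations(members, "phone"))
```

In a modeling session a non-empty result would suggest that the attribute class does not belong in this entity class as proposed.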

Each entity class has associated with it at least one key class. A key
class is just a special subset of the attribute classes which define the entity
class. What makes such key class subsets special is that it can be determined
that for any instance, the values of the attributes of that instance (which
correspond to the attribute classes in a key class), collectively, will uniquely
distinguish that instance of the entity class from all other instances. In an IDEF1
diagram, the key class subsets are located in the upper left corner of the entity
class box for which the key class is being defined. Key classes are not named or
labeled; a key class is denoted by enclosing the subset of attribute classes
that make up the key class in parentheses or by underlining the subset. In
the metamodels of this report we will always use the parentheses convention.
It should be noted that entity classes are allowed to have multiple key classes.
The multiple key classes would reflect multiple ways of identifying an entity
class instance. For example, in a model of a typical business environment,
an EMPL entity class might have multiple key classes. The
first would consist of the employee’s name in combination with an employee
number. The second key class may consist only of the employee’s Social
Security Number. An EMPL entity class instance could be
uniquely identified by either key class (see insert for example).

A.2.3  Link  (or  Relation)  Classes

A link is a binary relationship that exists between two entities, established
by the sharing of a common attribute (or attributes) which must assume the
exact same value in each of the two entities involved in the link. In IDEF1 the
generalization of all such links involving instances of the same two classes of
entities and the same shared class(es) of attribute(s) is called a link type, or
(more traditionally) link class. A link class establishes a binary relationship
between two entity classes that share a common attribute class. A link class
is represented by a line running between the boxes of the two entity classes.
A label, representing the name of the link class, is displayed over the line
representing the link. Because of the attribute class ownership property, a
link indicates a dependence of one entity class on the other entity class. The
dependent entity class is considered to be existence dependent since a
member of that entity class cannot exist unless the corresponding member of the
independent entity class already exists. In general, IDEF1 uses links to
represent common types of organizational constraints (sometimes referred to as
business rules) on the information that is managed. It should be noted that
not all of the business rules can be represented with the standard IDEF1
language constructs. In another report we describe a constraint language called
the Information Systems Constraint Language (ISyCL). ISyCL (pronounced
“icicle”) is used to augment the standard IDEF1 language as needed in this
report to capture some of the more complex rules of individual methods.

Figure 6: One-to-zero-or-one Link Class.

A link class also has a cardinality associated with it, specifying the
number of members of each entity class that can be involved in a relationship
with a single member of the other entity class. Figure 6 shows the
syntactic representation of a one-to-zero-or-one (or, thought of functionally in the
other direction, one-to-one) relationship.

A link with this cardinality represents the fact that one member of the
independent entity class can be associated with zero or one members of the
dependent entity class. However, each member of the dependent entity class
is associated with one and only one member of the independent entity class.

Figure 7 shows the syntactic representation of a weak one-to-many (or
functionally, weak many-to-one) relationship.

Figure 7: Weak one-to-many Link Class.

In this situation, an independent entity class member can be associated with
zero, one, or many dependent entity class members. Again, each member of
the dependent entity class is associated with one and only one member of
the independent entity class.

Figure 8 shows the syntactic representation of a strong one-to-many
(functionally, strong many-to-one) relationship.

Here, the independent entity class member must be associated with at least
one member of the dependent entity class. Again, each member of
the dependent entity class is associated with one and only one member of
the independent entity class.

Notice that IDEF1 does not allow a many-to-many relationship or a
zero-or-one-to-zero-or-one relationship in what is considered a final model. These
relationships make the dependency situation ambiguous. The resolution of
such uncertain situations (which often arise in the early phases of the
corresponding analysis) often results in the analyst’s determination that the
suspected relationship is unsupported by the analysis data. Alternatively the
analyst may discover additional entity class(es) that are dependent on both
of the entity classes involved in the “many-to-many” relationship (an
example of this is shown in Figure 9).
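
One way to picture the resolution of a suspected many-to-many relationship is sketched below in Python. A set of pairs between two independent entity classes is replaced by a new dependent entity class with one many-to-one link to each of them. The employee/project example, the assignment_N identifiers, and the function name are all hypothetical.

```python
from typing import Dict, Set, Tuple


def resolve_many_to_many(pairs: Set[Tuple[str, str]]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Introduce one dependent instance per pair and return its two links.

    Each returned dictionary is a function from the new dependent class to
    one of the two independent classes.
    """
    left_link: Dict[str, str] = {}
    right_link: Dict[str, str] = {}
    for number, (left, right) in enumerate(sorted(pairs), start=1):
        dependent = f"assignment_{number}"
        left_link[dependent] = left
        right_link[dependent] = right
    return left_link, right_link


# Employees work on many projects, and projects have many employees.
pairs = {("emp_1", "proj_a"), ("emp_1", "proj_b"), ("emp_2", "proj_a")}
to_employee, to_project = resolve_many_to_many(pairs)
print(to_employee)
print(to_project)
```

Because each new instance is mapped to exactly one member of each independent class, both links are functions, and the original pairs can be recovered by reading the two links side by side.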

Note also that, when specifying a one-to-many link class (either weak or
strong), there is no way of constraining that link to a specific upper bound
(for example, a one-to-five relationship). Such details are left to ISyCL if
considered absolutely necessary.

Figure 8: Strong one-to-many Link Class.

A.2.4  Inheritance

Previously we noted that the sharing of attribute classes between two entity
classes was the basis for declaring the existence of a link class between those
entity classes. However, link classes are generally suspected (or proposed)
by the analyst prior to the discovery of exactly which attribute classes are
shared. IDEF1 also places certain restrictions on which attribute classes
may be (and must be) shared in order for a valid link class to be defined.
When a link class is defined between two entity classes, certain information is
shared between those entity classes. The attribute classes that make up the
key classes of the independent entity class must become attribute classes for
the dependent entity class. It is possible for the inherited attribute classes
to become part of the key class of the dependent entity class. In fact, the
attributes must become part of the key class when a link class has a
one-to-zero-or-one link cardinality. In the case of a strong one-to-many relationship,
the attributes that are shared cannot make up a key that would be a subset
of the key of the independent entity class from which they came.
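
The migration of key attributes down a link can be sketched as follows. This is an illustration under stated assumptions, not the IDEF1 procedure itself: an entity class is a dictionary with an attribute list and a list of key classes, only the first key class of the independent class is migrated, and in the one-to-zero-or-one case the migrated attributes are recorded as a key class of their own. The name add_link and the sample classes are hypothetical.

```python
from typing import Dict, List


def add_link(independent: Dict[str, list], dependent: Dict[str, list],
             one_to_zero_or_one: bool = False) -> Dict[str, list]:
    """Return a copy of the dependent class after inheriting key attributes."""
    migrated = tuple(independent["key_classes"][0])  # assumption: first key class
    attributes: List[str] = dependent["attributes"] + [
        a for a in migrated if a not in dependent["attributes"]
    ]
    key_classes = list(dependent["key_classes"])
    if one_to_zero_or_one and migrated not in key_classes:
        key_classes.append(migrated)  # inherited attributes must identify the dependent
    return {"attributes": attributes, "key_classes": key_classes}


department = {"attributes": ["dept_num", "dept_name"], "key_classes": [("dept_num",)]}
employee = {"attributes": ["emp_num", "salary"], "key_classes": [("emp_num",)]}

print(add_link(department, employee))
print(add_link(department, employee, one_to_zero_or_one=True))
```

The strong one-to-many restriction mentioned above is not enforced in this sketch.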


Figure 9: Resolution of a many-to-many relation (box labels, left to right:
independent entity class, dependent entity class, independent entity class).

A  A Formal Account of Information Structures^19

In  this  appendix  we  will  make  use  of  our  formal  apparatus  a  little  more 
rigorously  to  give  a  general  definition  of  an  information  structure.  A  note 
before  we  continue.  Once  the  notion  of  a  formal  language  is  defined,  it  is
often  easier  to  mingle  plain  English  with  the  constraint  language  for  easier 
readability.  The  only  important  point  is  that  anything  said  in  this  more 
informal  fashion  can  be  stated  if  need  be  in  a  purely  formal  way.  We  shall 
follow  this  practice  here. 

In addition to the usual number-theoretic and set-theoretic apparatus, the
elements of our constraint language for the purpose of giving a general
definition of information structures will contain a raft of new constants, function
symbols, and predicates. These are highlighted below in boldface or italic.

Also, since the distinction between object language and metalanguage should
be well understood by now, we will revert to the use of the more standard
identity predicate = in the object language here.

A.1  Intensional  Information  Structures

An intensional information structure (IIS) $I$ is a seven-tuple
$\langle \mathbf{E}, \mathbf{BL}, \mathbf{OA}, \mathbf{V}, \mathbf{CL}, \mathbf{IA}, \mathbf{F}\rangle$, where

• $\mathbf{E}$ is a finite set of objects known as entity classes,

• $\mathbf{BL} = \bigcup\{\mathbf{BL}^{1}, \mathbf{BL}^{*}, \mathbf{BL}^{0}\}$ is the union of three pairwise disjoint finite
sets of objects known as basic link classes or basic link types,

• $\mathbf{OA}$ is a finite set of objects known as owned attributes,

• $\mathbf{V}$ is a set of sets known as attribute value classes,

• $\mathbf{CL}$ is a finite set of objects known as composite link classes (types), to
be described below,

• $\mathbf{IA}$ is a finite set of objects known as inherited attributes, and

• $\mathbf{F} = \{\mathit{back}, \mathit{front}, \mathit{owner}, \mathit{target}, \mathit{kc}\}$ is a set of functions described
below.

19 This work was partially supported by a grant from Tandem Computer Corporation.

Intuitively,  entity  classes  are  the  basic  intensional  types  whose  instances  can 
appear  in  concrete  realizations,  or  instantiations,  of  an  IIS.  Basic  link  types 
are  functions  in  intension  that  map  the  entities  of  one  type  E — called  the 
back  of  the  link  type — to  the  entities  of  another  (possibly  the  same)  type 
E' — called  the  front  of  the  link  type.  And  owned  attributes  are  functions 
in  intension  that  map  the  entities  of  a  given  type  E — called  the  owner  of 
the  attribute — to  a  given  attribute  value  class  V — called  the  target  of  the 
attribute.  An  attribute  value  class  is  thus  to  be  thought  of  as  the  range  of 
possible  values  for  a  particular  owned  attribute.  Modeling  these  intuitive 
connections  is  the  job  of  the  first  four  functions  in  F.  Specifically, 

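• $\mathit{back} : \mathbf{BL} \to \mathbf{E}$;

• $\mathit{front} : \mathbf{BL} \to \mathbf{E}$;

• $\mathit{owner} : \mathbf{OA} \to \mathbf{E}$;

• $\mathit{target} : \mathbf{OA} \to \mathbf{V}$.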

That is, the function $\mathit{back}$ maps a basic link type $l \in \mathbf{BL}$ to an entity
class $e \in \mathbf{E}$, i.e., $e = \mathit{back}(l)$. Similarly for the other functions. To enable
us to use more traditional functional terminology, we define the functions
$\mathit{domain} = \mathit{back} \cup \mathit{owner}$ and $\mathit{codomain} = \mathit{front} \cup \mathit{target}$.
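
The four functions, and the derived domain and codomain, can be modeled directly as Python dictionaries. The fragment below is a hypothetical IIS; every class, link, and attribute name in it is illustrative. It also assumes that the sets of basic links and owned attributes are disjoint, so that the union of two functions is again a function.

```python
# Hypothetical fragment of an IIS; all names are illustrative.
E = {"employee", "department", "secretary"}
BL = {"WORKS-IN", "WORKS-FOR"}
OA = {"DEPT-NUM-OF", "SALARY-OF"}
V = {"dept_numbers", "salaries"}

back = {"WORKS-IN": "employee", "WORKS-FOR": "secretary"}
front = {"WORKS-IN": "department", "WORKS-FOR": "employee"}
owner = {"DEPT-NUM-OF": "department", "SALARY-OF": "employee"}
target = {"DEPT-NUM-OF": "dept_numbers", "SALARY-OF": "salaries"}

# Assumption: BL and OA share no members, so merging the dictionaries
# is the set-theoretic union of the two functions.
domain = {**back, **owner}
codomain = {**front, **target}


def is_well_typed() -> bool:
    """Check that each function has the declared domain and range."""
    return (
        BL.isdisjoint(OA)
        and set(back) == BL and set(front) == BL
        and set(owner) == OA and set(target) == OA
        and set(back.values()) <= E and set(front.values()) <= E
        and set(owner.values()) <= E and set(target.values()) <= V
    )


print(is_well_typed())
print(domain["WORKS-FOR"], codomain["DEPT-NUM-OF"])
```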

It is easiest to model the intuitive nature of composite links (the
members of $\mathbf{CL}$) as finite sequences (i.e., n-tuples) of basic links. Call any such
sequence $s = \langle l_1, \ldots, l_n\rangle$ happy just in case for all $i < n$ ($i > 0$),
$\mathit{back}(l_i) = \mathit{front}(l_{i+1})$.20 Then $\mathbf{CL}$ meets the condition

C1: $\mathbf{CL} \subseteq \{s \mid s$ is a happy sequence of basic link types$\}$.

20 The idea is that a happy sequence represents a chain of connected link types such
that the back of each link type (save the one beginning the chain) is the front of the
preceding one in the chain. Now in fact, the actual definition here represents this idea
backwards: the intuitive beginning of such a connected chain is actually, as defined, the last
member in the formal representation $\langle l_1, \ldots, l_n\rangle$. However, this definition mirrors directly
the corresponding IDEF1 syntax for such chains, and hence in the long run makes for a
simpler semantics.

Given this, the functions $\mathit{back}$ and $\mathit{front}$ can be extended such that,

Def 6: For composite links $L = \langle l_1, \ldots, l_n\rangle$,

• $\mathit{back}(L) =_{df} \mathit{back}(l_n)$;

• $\mathit{front}(L) =_{df} \mathit{front}(l_1)$.

The  definitions  of  domain  and  codomain  can  then  be  broadened  to  include 
these  newly  defined  extensions  in  the  obvious  way  as  well. 
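
The happiness condition and the extended back and front functions can be sketched in Python, assuming that a composite link is a tuple of basic link names and that back and front are dictionaries. The link names reuse the earlier informal example, and the function names are hypothetical.

```python
from typing import Tuple

back = {"WORKS-IN": "employee", "WORKS-FOR": "secretary"}
front = {"WORKS-IN": "department", "WORKS-FOR": "employee"}


def is_happy(seq: Tuple[str, ...]) -> bool:
    """True if back(l_i) == front(l_{i+1}) for every adjacent pair."""
    return all(back[a] == front[b] for a, b in zip(seq, seq[1:]))


def back_of(seq: Tuple[str, ...]) -> str:
    """back(L) is the back of the LAST member of the sequence."""
    return back[seq[-1]]


def front_of(seq: Tuple[str, ...]) -> str:
    """front(L) is the front of the FIRST member of the sequence."""
    return front[seq[0]]


chain = ("WORKS-IN", "WORKS-FOR")           # WORKS-IN • WORKS-FOR
print(is_happy(chain))
print(is_happy(("WORKS-FOR", "WORKS-IN")))  # wrong order
print(back_of(chain), front_of(chain))
```

The sequence is written in the order of functional composition, so the link applied first (WORKS-FOR) comes last, which is why back is read from the end of the tuple.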

Henceforth, let $\mathbf{L} = \mathbf{BL} \cup \mathbf{CL}$. The composite nature of composite link
types can be highlighted by defining an operator $@$ on $\mathbf{L}$ such that,

Def 7: For basic links $l$, $l'$, and composite links $L$, $L'$,

• $l@l' =_{df} \langle l, l'\rangle$, if $\langle l, l'\rangle \in \mathbf{CL}$; otherwise $l@l'$ is undefined;

• $l@L =_{df} \langle l\rangle^\frown L$, if $\langle l\rangle^\frown L \in \mathbf{CL}$; otherwise $l@L$ is undefined;21

• $L@l =_{df} L^\frown \langle l\rangle$, if $L^\frown \langle l\rangle \in \mathbf{CL}$; otherwise $L@l$ is undefined;

• $L@L' =_{df} L^\frown L'$, if $L^\frown L' \in \mathbf{CL}$; otherwise $L@L'$ is undefined.

Informally, then, $X@Y$ signifies the composition of the link type $X$ with the
link type $Y$.
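
All four cases of the @ operator reduce to a single concatenation once a basic link is treated as a one-member sequence. The Python sketch below takes that view, with None standing in for "undefined". The set CL, the link LOCATED-AT, and the function names are hypothetical.

```python
from typing import Optional, Set, Tuple, Union

Link = Union[str, Tuple[str, ...]]


def as_sequence(link: Link) -> Tuple[str, ...]:
    """Treat a basic link as a one-member sequence."""
    return link if isinstance(link, tuple) else (link,)


def at(x: Link, y: Link, CL: Set[Tuple[str, ...]]) -> Optional[Tuple[str, ...]]:
    """Return x@y, or None when the concatenation is not a member of CL."""
    candidate = as_sequence(x) + as_sequence(y)
    return candidate if candidate in CL else None


# Assumed composite links of a small structure.
CL = {
    ("WORKS-IN", "WORKS-FOR"),
    ("LOCATED-AT", "WORKS-IN", "WORKS-FOR"),
}

print(at("WORKS-IN", "WORKS-FOR", CL))
print(at("LOCATED-AT", ("WORKS-IN", "WORKS-FOR"), CL))
print(at("WORKS-FOR", "WORKS-IN", CL))
```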

Intuitively, an inherited attribute is the composition of a link type with
an owned attribute. Thus, modeling composition in terms of sequences as
we are, we specify that the set $\mathbf{IA}$ of inherited attributes meet the condition

C2: $\mathbf{IA} \subseteq \{X \mid$ for some $a \in \mathbf{OA}$, either for some $l \in \mathbf{BL}$, $X = \langle a, l\rangle$, or for
some $L \in \mathbf{CL}$, $X = \langle a\rangle^\frown L\}$.

That is, a member of $\mathbf{IA}$ must be either, in the simplest case, a pair consisting
of an owned attribute (i.e., a member of $\mathbf{OA}$) and a basic link type (i.e., a
member of $\mathbf{BL}$), or else the result of tacking an owned attribute onto the
beginning of a composite link type.

21 Where $s^\frown s'$ is concatenation, i.e., the result of tacking the sequence $s'$ onto the end
of the sequence $s$.

We  then  extend  the  definition  of  @  such  that, 

Def 8: For $a \in OA$, $l \in BL$, $L \in CL$,

• $a @ l =_{df} (a, l)$, if $(a, l) \in IA$, and undefined otherwise;

• $a @ L =_{df} (a) \frown L$, if $(a) \frown L \in IA$, and undefined otherwise.

Given  this,  we  have 

Def 9: For any inherited attribute $A = a @ L$,

• $\text{owned-attr}(A) =_{df} a$,

• $\text{link}(A) =_{df} L$.

We can then extend the definition of the function owner to a function g-owner
(for “generalized owner”) on the set of all attributes $\mathbf{A} = OA \cup IA$
such that,

Def 10: For $a \in OA$, $\text{g-owner}(a) =_{df} \text{owner}(a)$; for $A \in IA$, $\text{g-owner}(A) =_{df}
\text{back}(\text{link}(A))$.

That  is,  the  g-owner  of  a  given  owned  or  inherited  attribute,  viewed  as  a 
function,  is  its  domain. 

The last element kc of $\mathcal{I}$ is a function from entity classes $E$ to sets of
subsets of $\mathbf{A}$ (intuitively, the key classes of $E$) that meets the following
conditions:

C3: For all $E \in \mathbf{E}$, $\text{kc}(E) \neq \emptyset$,

that  is,  the  set  of  key  classes  for  any  given  entity  class  must  be  nonempty, 
i.e.,  every  entity  class  must  have  at  least  one  key  class. 

C4: For all $E \in \mathbf{E}$, and for all $K, K' \in \text{kc}(E)$, $K \not\subset K'$.[22]

C5: For all $E \in \mathbf{E}$, and for all $A \in \bigcup \text{kc}(E)$, $\text{g-owner}(A) = E$,

that is, the attributes in every key class of a given entity class $E$ must be
owned by $E$, i.e., have $E$ as their domain.

[22] $\subset$, recall, signifies the proper subset relation. Note that this condition rules out the
possibility of an empty key class alongside any other key class, since the empty set is a
proper subset of every nonempty set.

Now  we  define  the  important  notion  of  a  walk  and  related  concepts.  These 
will  be  used  most  directly  to  define  information  structures. 

Def 11: Let $\mathcal{E} = (E_1, \ldots, E_n)$ be a sequence of entity classes, let $\Lambda =
(l_1, \ldots, l_{n-1})$ be a sequence of basic link types, and let $W = (\mathcal{E}, \Lambda)$. Then

• $W$ is a walk (from $E_1$ to $E_n$) iff for all $i < n$, $\text{back}(l_i) = E_i$ and
$\text{front}(l_i) = E_{i+1}$, or $\text{back}(l_i) = E_{i+1}$ and $\text{front}(l_i) = E_i$.

• If $W$ is a walk, then $W$ is increasing iff for all $i < n$, if $\text{back}(l_i) = E_i$,
then $l_i \in BL^{\to}$, and if $\text{back}(l_i) = E_{i+1}$, then $l_i \in BL^{*}$.

• $W$ is cyclic iff $E_1 = E_n$.

• $\mathcal{I}$ is connected iff for all distinct $E, E' \in \mathbf{E}$, there is a walk from $E$ to
$E'$.

A walk, that is, intuitively, is a sequence of entity classes such that each
$E_i$ (save the last) is connected to its successor $E_{i+1}$ by a link type either
in one direction or the other. An increasing walk is one such that, when you
traverse the link types in a walk from $E_1$ to $E_n$, there can be no decrease in
cardinality as you move from the extension of one entity class (see paragraph
on information structure realizations below) to that of the next. The final
two notions are self-explanatory.
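The four notions of Def 11 can be checked mechanically. The following is a minimal sketch, assuming that each basic link type is recorded as a triple (back, front, kind), where kind is "1-1" for the one-to-one link types, "onto" for the onto link types, and "plain" otherwise, and that every link type has exactly one kind. The encoding and all function names are hypothetical.

```python
from typing import Dict, Iterable, Sequence, Tuple

# Assumed encoding: link name -> (back, front, kind).
LinkInfo = Dict[str, Tuple[str, str, str]]


def is_walk(ents: Sequence[str], links: Sequence[str], info: LinkInfo) -> bool:
    """Each l_i must join E_i and E_{i+1} in one direction or the other."""
    if not ents or len(links) != len(ents) - 1:
        return False
    for i, l in enumerate(links):
        b, f, _ = info[l]
        if (b, f) not in ((ents[i], ents[i + 1]), (ents[i + 1], ents[i])):
            return False
    return True


def is_increasing(ents: Sequence[str], links: Sequence[str], info: LinkInfo) -> bool:
    """Forward steps must use 1-1 links, backward steps onto links."""
    if not is_walk(ents, links, info):
        return False
    for i, l in enumerate(links):
        b, _, kind = info[l]
        if b == ents[i] and kind != "1-1":
            return False
        if b == ents[i + 1] and kind != "onto":
            return False
    return True


def is_cyclic(ents: Sequence[str]) -> bool:
    """A walk is cyclic iff it ends where it starts."""
    return ents[0] == ents[-1]


def is_connected(all_ents: Iterable[str], info: LinkInfo) -> bool:
    """Every pair of distinct entity classes is joined by some walk."""
    ents = set(all_ents)
    if not ents:
        return True
    adj: Dict[str, set] = {e: set() for e in ents}
    for b, f, _ in info.values():
        adj.setdefault(b, set()).add(f)
        adj.setdefault(f, set()).add(b)
    start = next(iter(ents))
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return ents <= seen


# Hypothetical link types for illustration only.
info = {
    "works_in": ("Employee", "Dept", "plain"),
    "badge_of": ("Badge", "Employee", "1-1"),
}
```

Connectivity is tested here by a depth-first search over the undirected graph whose edges are the basic link types, since a walk may traverse a link type in either direction.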

Given this apparatus, we can state the last conditions on $\mathcal{I}$:

C6: $\mathcal{I}$ is connected.

C7: $\mathcal{I}$ contains no increasing cyclic walks.

C8: For all $l \in BL$, there is some $K \in \text{kc}(\text{front}(l))$ such that for all $A \in
K$, $A @ l \in IA$;[23] if $l$ happens to be 1-1, i.e., if $l \in BL^{\to}$, then in addition
$\{A @ l \mid A \in K\} \in \text{kc}(\text{g-owner}(A @ l))$.

C8 captures the conditions on key classes noted in the final paragraph of the
previous section.

[23] I.e., informally, if $E$ is linked to $E'$ via $l$, then all the attributes $A$ in some key class of
$E'$ are inherited into $E$, i.e., $A @ l \in IA$ so that $\text{g-owner}(A @ l) = E$ and $\text{g-owner}(A) = E'$.


A.2 Information Structures and their Realizations

A complete realization of an IIS $\mathcal{I}$ is a 4-tuple $(\mathcal{I}, W, D, \text{ext})$, where $W$ is a
set of indices (intuitively, the set of all possible realizations of $\mathcal{I}$), $D$ is a set
of objects (intuitively, the set of all possible instances of all the entity classes
in $\mathbf{E}$) and ext is a function that, for each index $w \in W$, maps elements of
$\mathbf{E} \cup OA \cup BL$ into objects of the appropriate sort as follows:

C9: For each $E \in \mathbf{E}$, $\text{ext}(w, E) \subseteq D$.

C10: For all $E, E' \in \mathbf{E}$, and for all $w \in W$, if $E \neq E'$, then $\text{ext}(w, E) \cap
\text{ext}(w, E') = \emptyset$.

C11: For each $A \in OA$, $\text{ext}(w, A) \in \{f \mid f : \text{ext}(w, \text{owner}(A)) \to
\text{target}(A)\}$.

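C12: For each $l \in BL$,

• if $l \in BL^{\to}$, then $\text{ext}(w, l) \in \{f \mid f : \text{ext}(w, \text{back}(l)) \to \text{ext}(w, \text{front}(l)) \text{ is one-to-one}\}$.

• if $l \in BL^{*}$, then $\text{ext}(w, l) \in \{f \mid f : \text{ext}(w, \text{back}(l)) \to \text{ext}(w, \text{front}(l)) \text{ is onto}\}$.[24]

• if $l \in BL^{\circ}$, then $\text{ext}(w, l) \in \{f \mid f : \text{ext}(w, \text{back}(l)) \to \text{ext}(w, \text{front}(l))\}$.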
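For a single index $w$, condition C12 amounts to a check on one finite function. A minimal sketch, assuming the extension of a link type at $w$ is stored as a Python dict from instances of the back entity class to instances of the front entity class, and reusing the hypothetical kind labels "1-1", "onto", and "plain" for the three families of basic link types:

```python
from typing import Dict, Set


def is_function(f: Dict, dom: Set, cod: Set) -> bool:
    """f is defined on exactly dom and takes all its values in cod."""
    return set(f) == dom and all(v in cod for v in f.values())


def is_one_to_one(f: Dict) -> bool:
    """No two arguments share a value."""
    return len(set(f.values())) == len(f)


def is_onto(f: Dict, cod: Set) -> bool:
    """Every element of cod is the value of some argument."""
    return cod <= set(f.values())


def satisfies_c12(kind: str, f: Dict, back_ext: Set, front_ext: Set) -> bool:
    """Check C12 for one link type at one index w."""
    if not is_function(f, back_ext, front_ext):
        return False
    if kind == "1-1":
        return is_one_to_one(f)
    if kind == "onto":
        return is_onto(f, front_ext)
    return True  # "plain": any function will do


# Hypothetical extension: three children mapped to two employees.
child_of = {"amy": "e1", "bob": "e1", "cal": "e2"}
children = {"amy", "bob", "cal"}
employees = {"e1", "e2"}
```

With this data `child_of` passes as an onto link and as a plain link, but not as a one-to-one link. The check inspects one realization only; it says nothing about what holds across all indices in $W$.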

Though in any given realization the extension of a member of $BL^{\to}$ might
also be onto, that of a member of $BL^{*}$ might also be one-to-one, and that
of a member of $BL^{\circ}$ might be either one-to-one or onto, it should not be
possible that this could be the case without exception, i.e., in all possible
realizations. Thus, as further conditions on an IIS realization we have:

C13: For all $l \in BL$,

• If $l \in BL^{\to}$, there is a $w \in W$ such that $\text{ext}(w, l)$ is not onto;

• If $l \in BL^{*}$, there is a $w \in W$ such that $\text{ext}(w, l)$ is not one-to-one;

• If $l \in BL^{\circ}$, there is a $w \in W$ such that $\text{ext}(w, l)$ is not one-to-one and
a $w' \in W$ such that $\text{ext}(w', l)$ is not onto.

[24] Where $f$ is onto just in case every element of its codomain has something mapped to
it from its domain. The addition function on the natural numbers, for example, is onto
(every number is the sum of two numbers, itself and 0 for instance), while the square
function is not (not every number is the square of some number).

Note  that  these  conditions  cannot  be  enforced  in  a  database,  since,  e.g.,  there 
is  no  way  to  tell  whether  a  one-to-one  link  which  has  always  been  onto  will 
cease  being  so  with  the  next  entry.  For  instance,  by  coincidence,  it  might 
have  always  been  the  case  that  every  employee  in  a  company  has  one  child. 
Then  the  extension  of  the  link  type  from  employee-children  to  employee 
has always been one-to-one, despite the fact that this could change as soon
as  any  employee  has  a  second  child  (supposing  this  is  not  prohibited  by 
company  policy).  The  constraints  above  are  thus  to  be  thought  of  as  design 
constraints  rather  than  descriptive  constraints,  and  are  important  in  the 
construction  phase  of  an  information  model  or  database. 

For any $L = (l_1, \ldots, l_n) \in CL$, $\text{ext}(w, L)$ is the composition of the extensions
of the link types $l_i$ at $w$, i.e., $\text{ext}(w, L) = \text{ext}(w, l_1) \circ \cdots \circ \text{ext}(w, l_n)$.
Similarly, where $A$ is an inherited attribute $a @ L$, $\text{ext}(w, A) =
\text{ext}(w, \text{owned-attr}(A)) \circ \text{ext}(w, \text{link}(A))$.
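The order of composition is easy to get wrong, so a small worked sketch may help. It assumes, as before, that extensions at a fixed index are stored as Python dicts, and that each link in the chain maps into the domain of the link before it; the helper names and the sample data are hypothetical.

```python
from functools import reduce
from typing import Dict, List


def compose_two(f: Dict, g: Dict) -> Dict:
    """(f o g)(x) = f(g(x)): apply g first, then f."""
    return {x: f[g[x]] for x in g}


def ext_composite(exts: List[Dict]) -> Dict:
    """ext(w, L) = ext(w, l_1) o ... o ext(w, l_n), given [ext(w, l_1), ..., ext(w, l_n)]."""
    return reduce(compose_two, exts)


# Hypothetical extensions: l2 maps orders to customers, l1 maps customers to regions.
ext_l1 = {"c1": "north", "c2": "south"}
ext_l2 = {"o1": "c1", "o2": "c1", "o3": "c2"}

# Extension of the composite link L = (l1, l2): orders to regions.
ext_L = ext_composite([ext_l1, ext_l2])

# Extension of an inherited attribute a@L, where a is owned by the region class.
ext_a = {"north": "North Sales", "south": "South Sales"}
ext_A = compose_two(ext_a, ext_L)
```

Because the last link in the sequence is applied first, the domain of `ext_L` is the extension of the back of $l_n$, in agreement with Def 6.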

As  noted  above,  the  role  of  a  key  class  K  within  an  entity  class  E  is  to 
ensure  that  in  every  possible  realization  of  an  IIS,  the  instances  of  E  can  be 
distinguished  solely  in  terms  of  the  values  of  the  (extensions  of  the)  attributes 
in  K  in  that  realization.  This  is  expressed  formally  by  means  of  the  following 
constraint: 

C14: For all $E \in \mathbf{E}$, for all $K \in \text{kc}(E)$, for all $w \in W$, and for all $x, y \in
\text{ext}(w, E)$, if $\text{ext}(w, A)(x) = \text{ext}(w, A)(y)$ for all $A \in K$, then $x = y$.
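Within a single realization, C14 is the familiar uniqueness test for a candidate key. The following sketch assumes that the extensions of the attributes at one index are given as a dict of dicts keyed by attribute name; `satisfies_c14` and the sample data are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Set


def satisfies_c14(
    instances: Iterable[Hashable],
    key_class: Set[str],
    ext: Dict[str, Dict],
) -> bool:
    """True iff no two distinct instances agree on every attribute in key_class."""
    attrs = sorted(key_class)  # fix an order so value tuples are comparable
    seen: Dict[tuple, Hashable] = {}
    for x in instances:
        values = tuple(ext[a][x] for a in attrs)
        if values in seen and seen[values] != x:
            return False
        seen[values] = x
    return True


# Hypothetical extensions of two attributes of an Employee class at one index.
ext = {
    "ssn": {"e1": "111", "e2": "222"},
    "name": {"e1": "Ann", "e2": "Ann"},
}
employees = {"e1", "e2"}
```

Here {ssn} distinguishes the two employees while {name} does not. Passing the check at one index is necessary but not sufficient for C14, which quantifies over every $w \in W$.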

As also noted, key classes must be “minimal” in the sense that no proper subset
of a given key class may also meet C14; this is expressed as follows:

C15: For all $E \in \mathbf{E}$, for all $K \in \text{kc}(E)$, it is not the case that there is
an $S \subset K$ such that for all $w \in W$ and for all $x, y \in \text{ext}(w, E)$, if for all
$A \in S$, $\text{ext}(w, A)(x) = \text{ext}(w, A)(y)$, then $x = y$.

Like the conditions on basic links across possible realizations above, and for
the same sorts of reasons, C15 also cannot be enforced on a database.